The methods are supposed to return 0 on success, and the flag to enable low-power operation was inverted.
Signed-off-by: Ronald Tschalär <ronald@innovation.ch>
The device must be accessed via the configured baudrate for all operations, or the operations will time out.
Note that currently bcm_set_baudrate() always fails (the commands always return -EBUSY) and hence the operating baudrate is never actually set. But if it were, it would need to be the same as the init baudrate, so we keep setting oper_speed anyway.
Signed-off-by: Ronald Tschalär <ronald@innovation.ch>
When the UART is part of an intel-lpss based mfd, the resulting device
tree has an extra level. Specifically we have the following layout (with
the /sys/devices/ prefix removed for brevity):
ACPI device node:
LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:94/BCM2E7C:00
Associated physical device node:
LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:94/BCM2E7C:00/physical_node -> pci0000:00/0000:00:1e.0/BCM2E7C:00/
UART device node:
pci0000:00/0000:00:1e.0/dw-apb-uart.2
UART TTY device node:
pci0000:00/0000:00:1e.0/dw-apb-uart.2/tty
This driver was assuming that the physical device's parent is the UART,
but in this case the parent is the mfd device. We therefore check for
both cases.
Note that because this assumption appears in some user-space tools too,
it might be better to put the physical device node under the UART
instead of the mfd.
Signed-off-by: Ronald Tschalär <ronald@innovation.ch>
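The "check for both cases" logic described above can be sketched in plain userspace C. This is only an illustration under assumptions: `struct dev_node` and `shares_uart()` are hypothetical names, not the driver's types, and the sketch assumes the match is made by comparing the physical device's parent against the TTY device's parent (the UART) and, for the intel-lpss layout, against the UART's own parent (the mfd device).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node: only the parent link matters here. */
struct dev_node {
    const char *name;
    struct dev_node *parent;
};

/*
 * Return true if the physical (Bluetooth) device belongs to the UART that
 * owns the given TTY device. Two layouts are accepted:
 *   - the physical device sits directly under the UART;
 *   - the physical device sits under the mfd device, i.e. next to the UART.
 */
bool shares_uart(const struct dev_node *tty, const struct dev_node *phys)
{
    const struct dev_node *uart;

    if (tty == NULL || phys == NULL || phys->parent == NULL)
        return false;

    uart = tty->parent;
    if (uart == NULL)
        return false;

    if (phys->parent == uart)
        return true;                      /* classic layout */

    /* mfd layout: the UART and the physical device share the same parent */
    return uart->parent != NULL && phys->parent == uart->parent;
}
```

With the layout listed above, the TTY's parent would be dw-apb-uart.2 and the physical node's parent 0000:00:1e.0, so only the second comparison matches.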
Commit dec2c92 introduced locks around the proto functions, using rwlocks, which do not allow sleeping while the locks are held. However, the proto functions in hci_bcm use mutexes and hence need to be able to sleep. Therefore this replaces the rwlocks with rw_semaphores. Because the writes are very rare compared to the reads, the percpu variant is used.
Lastly, the locks in the tx_wakeup callback needed to be removed because that is called from an IRQ context. But since it doesn't actually call any proto functions, instead just queueing work, and the HCI_UART_PROTO_READY flag is checked (again) in the worker, this doesn't cause any problems.
Signed-off-by: Ronald Tschalär <ronald@innovation.ch>
commit 55acdd9 upstream.
Can be reproduced when running dlm_controld (tested on 4.4.x, 4.12.4):
# seq 1 100 | xargs -P0 -n1 dlm_tool join
# seq 1 100 | xargs -P0 -n1 dlm_tool leave
misc_register fails due to duplicate sysfs entry, which causes dlm_device_register to free ls->ls_device.name. In dlm_device_deregister the name was freed again, causing memory corruption.
According to the comment in dlm_device_deregister the name should've been set to NULL when registration fails, so this patch does that.
sysfs: cannot create duplicate filename '/dev/char/10:1'
------------[ cut here ]------------
warning: cpu: 1 pid: 4450 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70
modules linked in: msr rfcomm dlm ccm bnep dm_crypt uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev btusb media btrtl btbcm btintel bluetooth ecdh_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel thinkpad_acpi pcbc nvram snd_seq_midi snd_seq_midi_event aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_hda_intel snd_hda_codec cryptd intel_cstate arc4 snd_hda_core snd_seq snd_seq_device snd_hwdep iwldvm intel_rapl_perf mac80211 joydev input_leds iwlwifi serio_raw cfg80211 snd_pcm shpchp snd_timer snd mac_hid mei_me lpc_ich mei soundcore sunrpc parport_pc ppdev lp parport autofs4 i915 psmouse e1000e ahci libahci i2c_algo_bit sdhci_pci ptp drm_kms_helper sdhci pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops drm wmi video
cpu: 1 pid: 4450 comm: dlm_test.exe not tainted 4.12.4-041204-generic
hardware name: lenovo 232425u/232425u, bios g2et82ww (2.02 ) 09/11/2012
task: ffff96b0cbabe140 task.stack: ffffb199027d0000
rip: 0010:sysfs_warn_dup+0x56/0x70
rsp: 0018:ffffb199027d3c58 eflags: 00010282
rax: 0000000000000038 rbx: ffff96b0e2c49158 rcx: 0000000000000006
rdx: 0000000000000000 rsi: 0000000000000086 rdi: ffff96b15e24dcc0
rbp: ffffb199027d3c70 r08: 0000000000000001 r09: 0000000000000721
r10: ffffb199027d3c00 r11: 0000000000000721 r12: ffffb199027d3cd1
r13: ffff96b1592088f0 r14: 0000000000000001 r15: ffffffffffffffef
fs: 00007f78069c0700(0000) gs:ffff96b15e240000(0000) knlgs:0000000000000000
cs: 0010 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 000000178625ed28 cr3: 0000000091d3e000 cr4: 00000000001406e0
call trace:
sysfs_do_create_link_sd.isra.2+0x9e/0xb0
sysfs_create_link+0x25/0x40
device_add+0x5a9/0x640
device_create_groups_vargs+0xe0/0xf0
device_create_with_groups+0x3f/0x60
? snprintf+0x45/0x70
misc_register+0x140/0x180
device_write+0x6a8/0x790 [dlm]
__vfs_write+0x37/0x160
? apparmor_file_permission+0x1a/0x20
? security_file_permission+0x3b/0xc0
vfs_write+0xb5/0x1a0
sys_write+0x55/0xc0
? sys_fcntl+0x5d/0xb0
entry_syscall_64_fastpath+0x1e/0xa9
rip: 0033:0x7f78083454bd
rsp: 002b:00007f78069bbd30 eflags: 00000293 orig_rax: 0000000000000001
rax: ffffffffffffffda rbx: 0000000000000006 rcx: 00007f78083454bd
rdx: 000000000000009c rsi: 00007f78069bee00 rdi: 0000000000000005
rbp: 00007f77f8000a20 r08: 000000000000fcf0 r09: 0000000000000032
r10: 0000000000000024 r11: 0000000000000293 r12: 00007f78069bde00
r13: 00007f78069bee00 r14: 000000000000000a r15: 00007f78069bbd70
code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 2c c8 ff ff 4c 89 e2 48 89 de 48 c7 c7 b0 8e 0c a8 e8 41 e8 ed ff <0f> ff 48 89 df e8 00 d5 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
---[ end trace 40412246357cc9e0 ]---
dlm: 59f24629-ae39-44e2-9030-397ebc2eda26: leaving the lockspace group...
bug: unable to handle kernel null pointer dereference at 0000000000000001
ip: [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
pgd 0
oops: 0000 [#1] smp
modules linked in: dlm 8021q garp mrp stp llc openvswitch nf_defrag_ipv6 nf_conntrack libcrc32c iptable_filter dm_multipath crc32_pclmul dm_mod aesni_intel psmouse aes_x86_64 sg ablk_helper cryptd lrw gf128mul glue_helper i2c_piix4 nls_utf8 tpm_tis tpm isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc xen_wdt ip_tables x_tables autofs4 hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi 8139too serio_raw ata_piix 8139cp mii uhci_hcd ehci_pci ehci_hcd libata scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6
cpu: 0 pid: 394 comm: systemd-udevd tainted: g w 4.4.0+0 #1
hardware name: xen hvm domu, bios 4.7.2-2.2 05/11/2017
task: ffff880002410000 ti: ffff88000243c000 task.ti: ffff88000243c000
rip: e030:[<ffffffff811a3b4a>] [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
rsp: e02b:ffff88000243fd90 eflags: 00010202
rax: 0000000000000000 rbx: ffff8800029864d0 rcx: 000000000007b36c
rdx: 000000000007b36b rsi: 00000000024000c0 rdi: ffff880036801c00
rbp: ffff88000243fdc0 r08: 0000000000018880 r09: 0000000000000054
r10: 000000000000004a r11: ffff880034ace6c0 r12: 00000000024000c0
r13: ffff880036801c00 r14: 0000000000000001 r15: ffffffff8118dcc2
fs: 00007f0ab77548c0(0000) gs:ffff880036e00000(0000) knlgs:0000000000000000
cs: e033 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 0000000000000001 cr3: 000000000332d000 cr4: 0000000000040660
stack:
ffffffff8118dc90 ffff8800029864d0 0000000000000000 ffff88003430b0b0
ffff880034b78320 ffff88003430b0b0 ffff88000243fdf8 ffffffff8118dcc2
ffff8800349c6700 ffff8800029864d0 000000000000000b 00007f0ab7754b90
call trace:
[<ffffffff8118dc90>] ? anon_vma_fork+0x60/0x140
[<ffffffff8118dcc2>] anon_vma_fork+0x92/0x140
[<ffffffff8107033e>] copy_process+0xcae/0x1a80
[<ffffffff8107128b>] _do_fork+0x8b/0x2d0
[<ffffffff81071579>] sys_clone+0x19/0x20
[<ffffffff815a30ae>] entry_syscall_64_fastpath+0x12/0x71
code: f6 75 1c 4c 89 fa 44 89 e6 4c 89 ef e8 a7 e4 00 00 41 f7 c4 00 80 00 00 49 89 c6 74 47 eb 32 49 63 45 20 48 8d 4a 01 4d 8b 45 00 <49> 8b 1c 06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 ac 49 63
rip [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
rsp <ffff88000243fd90>
cr2: 0000000000000001
--[ end trace 70cb9fd1b164a0e8 ]--
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
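The double free described in this commit message follows a common pattern: an error path frees a buffer but leaves the pointer dangling, and the teardown path frees it again. Below is a minimal userspace sketch of the "set the name to NULL when registration fails" idea. The names `struct lockspace_dev`, `sketch_register()` and `sketch_deregister()` are hypothetical, and the `fail` parameter merely simulates a misc_register failure; none of this is the dlm code itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-lockspace device state. */
struct lockspace_dev {
    char *name;
};

/* Allocate the name, then "register"; `fail` simulates a registration error. */
int sketch_register(struct lockspace_dev *d, const char *name, int fail)
{
    size_t len = strlen(name) + 1;

    d->name = malloc(len);
    if (d->name == NULL)
        return -1;
    memcpy(d->name, name, len);

    if (fail) {
        free(d->name);
        d->name = NULL;   /* without this, deregister would free it again */
        return -1;
    }
    return 0;
}

/* Teardown treats a NULL name as "never registered" and does nothing. */
void sketch_deregister(struct lockspace_dev *d)
{
    if (d->name == NULL)
        return;
    free(d->name);
    d->name = NULL;
}
```

Calling `sketch_deregister()` after a failed `sketch_register()` is then harmless, which is the property the patch restores.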
|
Great job @roadrunner2! As to the first commit, the schematics of the MB8,1 call the line D'accord on the botched The occasional timeout on the first command sounds like a delay may be needed in As to the second commit, could you check where the The other two commits I need to take a closer look. |
|
@l1k Regarding the schematics etc, that looks correct: BTLP(1) puts the device into low power - see also the DSDT snippet I posted. The issue is that In any case I haven't actually tested suspend and resume yet - I just noticed the issue while looking at the code. Regarding EBUSY: I already tried adding up to 100ms delay after power up, but that didn't help. And I traced the EBUSY down to getting the bluetooth status code 0x0c (which gets translated to EBUSY by |
The issue is that bcm_apple_set_device_wake() gets called with enable false when going to sleep, and with enable true when waking up. Ergo enable == false => BTLP(1) and enable == true => BTLP(0). Unless I'm totally confused. Okay, thanks for the explanation. Apparently the inverse polarity of the pin is open coded in this driver, which is crap, normally this is declared in the platform firmware. Anyway, the hci_bcm driver is being migrated to the new serial device bus, i.e. there is no tty chardev anymore for the UART, invocation of hciattach is no longer necessary and the kernel automatically switches to N_HCI ldisc and probes the hci_bcm driver. This also solves the issue addressed by your third patch in this pull. An initial set of 9 patches by Hans de Goede was merged to bluetooth-next a day ago and there are 2 other patches needed by Frédéric Danis which are not merged yet but are slated for merging via the ACPI subsystem. Johan Hovold came up with a few comments and objections today but nothing that can't be resolved. I'm in the process of rebasing onto these patches, actually I've just done that but I need to review and refine a little further and will likely push the new branch on Monday. Thanks for your patience. |
|
So I've pushed the hci_bcm_v1 branch today as an initial attempt to rebase on current bluetooth-next. Compile-tested only as I don't have the hardware. Be sure to enable |
|
Frédéric Danis posted a v2 of his patches so I rebased the hci_bcm_v1 branch on top of them. Sorry, we're shooting at a moving target here. |
|
Another day, another rebase. This is now based on Frédéric Danis' v3 plus a fix by Johan Hovold, all on top of current bluetooth-next. |
|
😃 I'll give it a whirl in the next couple days. One thing I was wondering: should the proto-locks patch be pushed for 4.14? I honestly don't see how this can be working at all, unless nobody is using |
…() returns
The cgroup_taskset structure within the larger cgroup_mgctx structure
is supposed to be used once and then discarded. That is not really the
case in the hotplug code path:
cpuset_hotplug_workfn()
- cgroup_transfer_tasks()
- cgroup_migrate()
- cgroup_migrate_add_task()
- cgroup_migrate_execute()
In this case, the cgroup_migrate() function is called multiple times
with the same cgroup_mgctx structure to transfer the tasks from
one cgroup to another one-by-one. The second time cgroup_migrate()
is called, the cgroup_taskset will be in an incorrect state and so
may cause the system to panic. For example,
[ 150.888410] Faulting instruction address: 0xc0000000001db648
[ 150.888414] Oops: Kernel access of bad area, sig: 11 [#1]
[ 150.888417] SMP NR_CPUS=2048
[ 150.888417] NUMA
[ 150.888419] pSeries
:
[ 150.888545] NIP [c0000000001db648] cpuset_can_attach+0x58/0x1b0
[ 150.888548] LR [c0000000001db638] cpuset_can_attach+0x48/0x1b0
[ 150.888551] Call Trace:
[ 150.888554] [c0000005f65cb940] [c0000000001db638] cpuset_can_attach+0x48/0x1b0 (unreliable)
[ 150.888559] [c0000005f65cb9a0] [c0000000001cff04] cgroup_migrate_execute+0xc4/0x4b0
[ 150.888563] [c0000005f65cba20] [c0000000001d7d14] cgroup_transfer_tasks+0x1d4/0x370
[ 150.888568] [c0000005f65cbb70] [c0000000001ddcb0] cpuset_hotplug_workfn+0x710/0x8f0
[ 150.888572] [c0000005f65cbc80] [c00000000012032c] process_one_work+0x1ac/0x4d0
[ 150.888576] [c0000005f65cbd20] [c0000000001206f8] worker_thread+0xa8/0x5b0
[ 150.888580] [c0000005f65cbdc0] [c0000000001293f8] kthread+0x168/0x1b0
[ 150.888584] [c0000005f65cbe30] [c00000000000b368] ret_from_kernel_thread+0x5c/0x74
To allow reuse of the cgroup_mgctx structure, some fields in that
structure are now re-initialized at the end of the
cgroup_migrate_execute() call so that the structure can be reused in a
later iteration without causing problems.
This bug was introduced in the commit e595cd7 ("cgroup: track
migration context in cgroup_mgctx") in 4.11. This commit moves the
cgroup_taskset initialization out of cgroup_migrate(). The commit
10467270fb3 ("cgroup: don't call migration methods if there are no
tasks to migrate") helped, but did not completely resolve the problem.
Fixes: e595cd7 ("cgroup: track migration context in cgroup_mgctx")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v4.11+
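The idea of re-initializing per-migration state at the end of the execute step, so the same context can be reused by the next iteration, can be sketched as follows. The structures and function names are hypothetical and heavily simplified; the real cgroup_taskset tracks lists of css_sets and tasks, not a counter.

```c
/* Hypothetical, heavily simplified stand-ins for cgroup_taskset / cgroup_mgctx. */
struct taskset_sketch {
    int nr_tasks;              /* tasks queued for the current migration */
};

struct mgctx_sketch {
    struct taskset_sketch tset;
};

/* Queue one task for migration (stands in for cgroup_migrate_add_task()). */
void sketch_add_task(struct mgctx_sketch *ctx)
{
    ctx->tset.nr_tasks++;
}

/*
 * Migrate whatever is queued and return how many tasks were moved.
 * The taskset is reset before returning, so a second call on the same
 * context starts from a clean state instead of seeing stale entries.
 */
int sketch_execute(struct mgctx_sketch *ctx)
{
    int moved = ctx->tset.nr_tasks;

    ctx->tset.nr_tasks = 0;    /* re-initialize for the next iteration */
    return moved;
}
```

Without the reset, a loop that adds one task and executes per iteration would report two tasks on its second pass, which mirrors the stale-state problem in the hotplug path.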
Xiaolong reported a suspicious rcu_dereference_check in the device unregister notifier callback. Since we do not dereference the rx_handler_data, it's ok to just check for the value of the pointer. Note that this section is already protected by rtnl_lock.
[ 101.364846] WARNING: suspicious RCU usage
[ 101.365654] 4.13.0-rc6-01701-gceed73a #1 Not tainted
[ 101.370873] -----------------------------
[ 101.372472] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:57 suspicious rcu_dereference_check() usage!
[ 101.374427]
[ 101.374427] other info that might help us debug this:
[ 101.374427]
[ 101.387491]
[ 101.387491] rcu_scheduler_active = 2, debug_locks = 1
[ 101.389368] 1 lock held by trinity-main/2809:
[ 101.390736] #0: (rtnl_mutex){+.+.+.}, at: [<8146085b>] rtnl_lock+0xf/0x11
[ 101.395482]
[ 101.395482] stack backtrace:
[ 101.396948] CPU: 0 PID: 2809 Comm: trinity-main Not tainted 4.13.0-rc6-01701-gceed73a #1
[ 101.398857] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 101.401079] Call Trace:
[ 101.401656] dump_stack+0xa1/0xeb
[ 101.402871] lockdep_rcu_suspicious+0xc7/0xd0
[ 101.403665] rmnet_is_real_dev_registered+0x40/0x4e
[ 101.405199] rmnet_config_notify_cb+0x2c/0x142
[ 101.406344] ? wireless_nlevent_flush+0x47/0x71
[ 101.407385] notifier_call_chain+0x2d/0x47
[ 101.408645] raw_notifier_call_chain+0xc/0xe
[ 101.409882] call_netdevice_notifiers_info+0x41/0x49
[ 101.411402] call_netdevice_notifiers+0xc/0xe
[ 101.412713] rollback_registered_many+0x268/0x36e
[ 101.413702] rollback_registered+0x39/0x56
[ 101.414965] unregister_netdevice_queue+0x79/0x88
[ 101.415908] unregister_netdev+0x16/0x1d
Fixes: ceed73a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver calls spi_get_drvdata() in its ->remove hook even though it has never called spi_set_drvdata(). Stack trace for posterity:
Unable to handle kernel NULL pointer dereference at virtual address 00000220
Internal error: Oops: 5 [#1] SMP ARM
[<8072f564>] (mutex_lock) from [<7f1400d0>] (iio_device_unregister+0x24/0x7c [industrialio])
[<7f1400d0>] (iio_device_unregister [industrialio]) from [<7f15e020>] (mcp320x_remove+0x20/0x30 [mcp320x])
[<7f15e020>] (mcp320x_remove [mcp320x]) from [<8055a8cc>] (spi_drv_remove+0x2c/0x44)
[<8055a8cc>] (spi_drv_remove) from [<805087bc>] (__device_release_driver+0x98/0x134)
[<805087bc>] (__device_release_driver) from [<80509180>] (driver_detach+0xdc/0xe0)
[<80509180>] (driver_detach) from [<8050823c>] (bus_remove_driver+0x5c/0xb0)
[<8050823c>] (bus_remove_driver) from [<80509ab0>] (driver_unregister+0x38/0x58)
[<80509ab0>] (driver_unregister) from [<7f15e69c>] (mcp320x_driver_exit+0x14/0x1c [mcp320x])
[<7f15e69c>] (mcp320x_driver_exit [mcp320x]) from [<801a78d0>] (SyS_delete_module+0x184/0x1d0)
[<801a78d0>] (SyS_delete_module) from [<80108100>] (ret_fast_syscall+0x0/0x1c)
Fixes: f5ce4a7 ("iio: adc: add driver for MCP3204/08 12-bit ADC")
Cc: Oskar Andero <oskar.andero@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Because keyctl_read_key() looks up the key with no permissions
requested, it may find a negatively instantiated key. If the key is
also possessed, we went ahead and called ->read() on the key. But the
key payload will actually contain the ->reject_error rather than the
normal payload. Thus, the kernel oopses trying to read the
user_key_payload from memory address (int)-ENOKEY = 0x00000000ffffff82.
Fortunately the payload data is stored inline, so it shouldn't be
possible to abuse this as an arbitrary memory read primitive...
Reproducer:
keyctl new_session
keyctl request2 user desc '' @s
keyctl read $(keyctl show | awk '/user: desc/ {print $1}')
It causes a crash like the following:
BUG: unable to handle kernel paging request at 00000000ffffff92
IP: user_read+0x33/0xa0
PGD 36a54067 P4D 36a54067 PUD 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 211 Comm: keyctl Not tainted 4.14.0-rc1 #337
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
task: ffff90aa3b74c3c0 task.stack: ffff9878c0478000
RIP: 0010:user_read+0x33/0xa0
RSP: 0018:ffff9878c047bee8 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff90aa3d7da340 RCX: 0000000000000017
RDX: 0000000000000000 RSI: 00000000ffffff82 RDI: ffff90aa3d7da340
RBP: ffff9878c047bf00 R08: 00000024f95da94f R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f58ece69740(0000) GS:ffff90aa3e200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000ffffff92 CR3: 0000000036adc001 CR4: 00000000003606f0
Call Trace:
keyctl_read_key+0xac/0xe0
SyS_keyctl+0x99/0x120
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7f58ec787bb9
RSP: 002b:00007ffc8d401678 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa
RAX: ffffffffffffffda RBX: 00007ffc8d402800 RCX: 00007f58ec787bb9
RDX: 0000000000000000 RSI: 00000000174a63ac RDI: 000000000000000b
RBP: 0000000000000004 R08: 00007ffc8d402809 R09: 0000000000000020
R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffc8d402800
R13: 00007ffc8d4016e0 R14: 0000000000000000 R15: 0000000000000000
Code: e5 41 55 49 89 f5 41 54 49 89 d4 53 48 89 fb e8 a4 b4 ad ff 85 c0 74 09 80 3d b9 4c 96 00 00 74 43 48 8b b3 20 01 00 00 4d 85 ed <0f> b7 5e 10 74 29 4d 85 e4 74 24 4c 39 e3 4c 89 e2 4c 89 ef 48
RIP: user_read+0x33/0xa0 RSP: ffff9878c047bee8
CR2: 00000000ffffff92
Fixes: 61ea0c0 ("KEYS: Skip key state checks when checking for possession")
Cc: <stable@vger.kernel.org> [v3.13+]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
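The failure mode above comes from one storage slot holding either a real payload or an error code, with the read path not checking which one it holds. The following userspace sketch illustrates guarding the read with a state check. It is an assumption-laden illustration, not the patch itself: `struct key_sketch` and `key_read_sketch()` are hypothetical, and the commit message does not spell out which exact check the fix adds.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef ENOKEY
#define ENOKEY 126   /* Linux value, used only if <errno.h> lacks it */
#endif

/* Hypothetical key: the same slot holds a payload or a rejection error. */
struct key_sketch {
    int negative;                 /* non-zero: negatively instantiated */
    union {
        const char *data;         /* valid only when !negative */
        int reject_error;         /* valid only when negative */
    } payload;
};

/*
 * Copy the payload into buf if it fits and return its length.
 * A negative key is rejected before the payload is interpreted,
 * so the error code is never treated as a pointer.
 */
long key_read_sketch(const struct key_sketch *key, char *buf, size_t buflen)
{
    size_t len;

    if (key->negative)
        return -ENOKEY;

    len = strlen(key->payload.data);
    if (buf != NULL && buflen >= len)
        memcpy(buf, key->payload.data, len);
    return (long)len;
}
```

The value -ENOKEY is -126, which as a 32-bit pattern is 0xffffff82, the address seen in RSI in the oops above.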
…rse nlmsg properly
ChunYu found a kernel crash by syzkaller:
[ 651.617875] kasan: CONFIG_KASAN_INLINE enabled
[ 651.618217] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 651.618731] general protection fault: 0000 [#1] SMP KASAN
[ 651.621543] CPU: 1 PID: 9539 Comm: scsi Not tainted 4.11.0.cov #32
[ 651.621938] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 651.622309] task: ffff880117780000 task.stack: ffff8800a3188000
[ 651.622762] RIP: 0010:skb_release_data+0x26c/0x590
[...]
[ 651.627260] Call Trace:
[ 651.629156] skb_release_all+0x4f/0x60
[ 651.629450] consume_skb+0x1a5/0x600
[ 651.630705] netlink_unicast+0x505/0x720
[ 651.632345] netlink_sendmsg+0xab2/0xe70
[ 651.633704] sock_sendmsg+0xcf/0x110
[ 651.633942] ___sys_sendmsg+0x833/0x980
[ 651.637117] __sys_sendmsg+0xf3/0x240
[ 651.638820] SyS_sendmsg+0x32/0x50
[ 651.639048] entry_SYSCALL_64_fastpath+0x1f/0xc2
It is caused by the skb_shared_info at the end of the sk_buff being overwritten by ISCSI_KEVENT_IF_ERROR when parsing nlmsg info from the skb in iscsi_if_rx.
During the loop, if skb->len == nlh->nlmsg_len and both are sizeof(*nlh), ev = nlmsg_data(nlh) will actually get skb_shinfo(SKB) instead and set a new value to skb_shinfo(SKB)->nr_frags by ev->type.
This patch fixes it by checking nlh->nlmsg_len properly there, to avoid accessing memory beyond the sk_buff data.
Reported-by: ChunYu Wang <chunwang@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
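The length condition this commit message describes can be illustrated in userspace C: a message whose declared length covers only the header leaves no room for the event payload, so reading the payload would run past the buffer. The struct layouts and the name `msg_has_event()` are hypothetical stand-ins (the real code works on struct nlmsghdr and struct iscsi_uevent), and the exact comparison used by the patch is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with the same field layout as a netlink header. */
struct hdr_sketch {
    uint32_t len;     /* declared total length: header plus payload */
    uint16_t type;
    uint16_t flags;
    uint32_t seq;
    uint32_t pid;
};

/* Hypothetical fixed-size event that is expected to follow the header. */
struct event_sketch {
    uint32_t type;
    uint32_t iferror;
};

/*
 * Return true only if the buffer holds a full header, the declared
 * length leaves room for a whole event, and the declared length does
 * not exceed what is actually in the buffer.
 */
bool msg_has_event(const struct hdr_sketch *h, size_t buf_len)
{
    if (buf_len < sizeof(*h))
        return false;
    if (h->len < sizeof(*h) + sizeof(struct event_sketch))
        return false;
    if (h->len > buf_len)
        return false;
    return true;
}
```

The crashing input corresponds to `h->len == buf_len == sizeof(*h)`, which this check rejects before any payload field is touched.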
Guillaume Nault says:
====================
l2tp: fix some races in session deletion
L2TP provides several interfaces for deleting sessions. Using two of them concurrently can lead to use-after-free bugs.
Patch #2 uses a flag to prevent double removal of L2TP sessions.
Patch #1 fixes a bug found along the way. Fixing this bug is also necessary for patch #2 to handle all cases.
This issue is similar to the tunnel deletion bug being worked on by Sabrina: https://patchwork.ozlabs.org/patch/814173/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ti-cpufreq and cpufreq-dt-platdev drivers both register a platform-device with the same name "cpufreq-dt" using the platform_device_register_*() routines. This leads to the warnings appended below.
Providing hardware information to the OPP framework along with the platform-device creation should be done by the ti-cpufreq driver before the cpufreq-dt driver comes into place.
This patch adds the TI am33xx, am43 and dra7 platforms (which use the opp-v2 property) to the blacklist of devices in the cpufreq-dt-platform driver to avoid creating the platform-device twice and to remove the warnings.
[ 2.370167] ------------[ cut here ]------------
[ 2.375087] WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x58/0x78
[ 2.383112] sysfs: cannot create duplicate filename '/devices/platform/cpufreq-dt'
[ 2.391219] Modules linked in:
[ 2.394506] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-next-20170912 #1
[ 2.402006] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 2.408437] [<c0110a28>] (unwind_backtrace) from [<c010ca84>] (show_stack+0x10/0x14)
[ 2.416568] [<c010ca84>] (show_stack) from [<c0827d64>] (dump_stack+0xac/0xe0)
[ 2.424165] [<c0827d64>] (dump_stack) from [<c0137470>] (__warn+0xd8/0x104)
[ 2.431488] [<c0137470>] (__warn) from [<c01374d0>] (warn_slowpath_fmt+0x34/0x44)
[ 2.439351] [<c01374d0>] (warn_slowpath_fmt) from [<c03459d0>] (sysfs_warn_dup+0x58/0x78)
[ 2.447938] [<c03459d0>] (sysfs_warn_dup) from [<c0345ab8>] (sysfs_create_dir_ns+0x80/0x98)
[ 2.456719] [<c0345ab8>] (sysfs_create_dir_ns) from [<c082c554>] (kobject_add_internal+0x9c/0x2d4)
[ 2.466124] [<c082c554>] (kobject_add_internal) from [<c082c7d8>] (kobject_add+0x4c/0x9c)
[ 2.474712] [<c082c7d8>] (kobject_add) from [<c05803e4>] (device_add+0xcc/0x57c)
[ 2.482489] [<c05803e4>] (device_add) from [<c0584b74>] (platform_device_add+0x100/0x220)
[ 2.491085] [<c0584b74>] (platform_device_add) from [<c05855a8>] (platform_device_register_full+0xf4/0x118)
[ 2.501305] [<c05855a8>] (platform_device_register_full) from [<c067023c>] (ti_cpufreq_init+0x150/0x22c)
[ 2.511253] [<c067023c>] (ti_cpufreq_init) from [<c0101df4>] (do_one_initcall+0x3c/0x170)
[ 2.519838] [<c0101df4>] (do_one_initcall) from [<c0c00eb4>] (kernel_init_freeable+0x1fc/0x2c4)
[ 2.528974] [<c0c00eb4>] (kernel_init_freeable) from [<c083bcac>] (kernel_init+0x8/0x110)
[ 2.537565] [<c083bcac>] (kernel_init) from [<c0107d18>] (ret_from_fork+0x14/0x3c)
[ 2.545981] ---[ end trace 2fc00e213c13ab20 ]---
[ 2.551051] ------------[ cut here ]------------
[ 2.555931] WARNING: CPU: 0 PID: 1 at lib/kobject.c:240 kobject_add_internal+0x254/0x2d4
[ 2.564578] kobject_add_internal failed for cpufreq-dt with -EEXIST, don't try to register things with the same name in the same directory.
[ 2.577977] Modules linked in:
[ 2.581261] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.13.0-next-20170912 #1
[ 2.590013] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 2.596437] [<c0110a28>] (unwind_backtrace) from [<c010ca84>] (show_stack+0x10/0x14)
[ 2.604573] [<c010ca84>] (show_stack) from [<c0827d64>] (dump_stack+0xac/0xe0)
[ 2.612172] [<c0827d64>] (dump_stack) from [<c0137470>] (__warn+0xd8/0x104)
[ 2.619494] [<c0137470>] (__warn) from [<c01374d0>] (warn_slowpath_fmt+0x34/0x44)
[ 2.627362] [<c01374d0>] (warn_slowpath_fmt) from [<c082c70c>] (kobject_add_internal+0x254/0x2d4)
[ 2.636666] [<c082c70c>] (kobject_add_internal) from [<c082c7d8>] (kobject_add+0x4c/0x9c)
[ 2.645255] [<c082c7d8>] (kobject_add) from [<c05803e4>] (device_add+0xcc/0x57c)
[ 2.653027] [<c05803e4>] (device_add) from [<c0584b74>] (platform_device_add+0x100/0x220)
[ 2.661615] [<c0584b74>] (platform_device_add) from [<c05855a8>] (platform_device_register_full+0xf4/0x118)
[ 2.671833] [<c05855a8>] (platform_device_register_full) from [<c067023c>] (ti_cpufreq_init+0x150/0x22c)
[ 2.681779] [<c067023c>] (ti_cpufreq_init) from [<c0101df4>] (do_one_initcall+0x3c/0x170)
[ 2.690377] [<c0101df4>] (do_one_initcall) from [<c0c00eb4>] (kernel_init_freeable+0x1fc/0x2c4)
[ 2.699510] [<c0c00eb4>] (kernel_init_freeable) from [<c083bcac>] (kernel_init+0x8/0x110)
[ 2.708106] [<c083bcac>] (kernel_init) from [<c0107d18>] (ret_from_fork+0x14/0x3c)
[ 2.716217] ---[ end trace 2fc00e213c13ab21 ]---
Fixes: edeec42 (cpufreq: dt-cpufreq: platdev Automatically create device with OPP v2)
Signed-off-by: Suniel Mahesh <sunil.m@techveda.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ARM short descriptor has already limited the physical address
to 32bit after the commit <76557391433c> ("iommu/io-pgtable: Sanitise
map/unmap addresses"). But in MediaTek 4GB mode, the physical address
is from 0x1_0000_0000 to 0x1_ffff_ffff. This will cause:
WARNING: CPU: 4 PID: 3900 at
xxx/drivers/iommu/io-pgtable-arm-v7s.c:482 arm_v7s_map+0x40/0xf8
Modules linked in:
CPU: 4 PID: 3900 Comm: weston Tainted: G S W 4.9.44 #1
Hardware name: MediaTek MT2712m1v1 board (DT)
task: ffffffc0eaa5b280 task.stack: ffffffc0e9858000
PC is at arm_v7s_map+0x40/0xf8
LR is at mtk_iommu_map+0x64/0x90
pc : [<ffffff80085b09e8>] lr : [<ffffff80085b29fc>] pstate: 000001c5
sp : ffffffc0e985b920
x29: ffffffc0e985b920 x28: 0000000127d00000
x27: 0000000000100000 x26: ffffff8008f9e000
x25: 0000000000000003 x24: 0000000000100000
x23: 0000000127d00000 x22: 00000000ff800000
x21: ffffffc0f7ec8ce0 x20: 0000000000000003
x19: 0000000000000003 x18: 0000000000000002
x17: 0000007f7e5d72c0 x16: ffffff80082b0f08
x15: 0000000000000001 x14: 000000000000003f
x13: 0000000000000000 x12: 0000000000000028
x11: 0088000000000000 x10: 0000000000000000
x9 : ffffff80092fa000 x8 : ffffffc0e9858000
x7 : ffffff80085b29d8 x6 : 0000000000000000
x5 : ffffff80085b09a8 x4 : 0000000000000003
x3 : 0000000000100000 x2 : 0000000127d00000
x1 : 00000000ff800000 x0 : 0000000000000001
...
Call trace:
[<ffffff80085b09e8>] arm_v7s_map+0x40/0xf8
[<ffffff80085b29fc>] mtk_iommu_map+0x64/0x90
[<ffffff80085ab5f8>] iommu_map+0x100/0x3a0
[<ffffff80085ab99c>] default_iommu_map_sg+0x104/0x168
[<ffffff80085aead8>] iommu_dma_alloc+0x238/0x3f8
[<ffffff8008098b30>] __iommu_alloc_attrs+0xa8/0x260
[<ffffff80085f364c>] mtk_drm_gem_create+0xac/0x180
[<ffffff80085f3894>] mtk_drm_gem_dumb_create+0x54/0xc8
[<ffffff80085d576c>] drm_mode_create_dumb_ioctl+0xa4/0xd8
[<ffffff80085cb2a0>] drm_ioctl+0x1c0/0x490
In order to satisfy this, limit the physical address to 32 bits.
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
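Limiting a physical address to 32 bits amounts to keeping only its low 32 bits before handing it to the page-table code. The helper below is a minimal sketch of that arithmetic; `limit_pa_32bit()` is a hypothetical name, and whether the driver applies exactly this mask at exactly this point is an assumption, not something the commit message states.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a physical address (hypothetical helper). */
uint64_t limit_pa_32bit(uint64_t paddr)
{
    return paddr & UINT64_C(0xffffffff);
}
```

For the address in x2 of the trace above, 0x127d00000, this yields 0x27d00000, i.e. the 4GB-mode offset bit 32 is dropped.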
…ping
When booting a PVM guest with large memory (e.g. 240GB), the initial mapping provided by XEN overlaps with the kernel module virtual space. When the mapping in this space is cleared by xen_cleanhighmap(), in certain cases a 2MB mapping can be left behind. This is because XEN initializes a 4MB aligned mapping but xen_cleanhighmap() finishes at a 2MB boundary.
When a module is loaded just on top of the 2MB space, the warning below is triggered:
WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
Call Trace:
[<ffffffff81117083>] warn_alloc_failed+0xf3/0x160
[<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff8103ca54>] module_alloc+0x64/0x70
[<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
[<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80
[<ffffffff810ac9a7>] move_module+0x27/0x150
[<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0
[<ffffffff810af0a8>] load_module+0x78/0x640
[<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90
[<ffffffff810af6d2>] sys_init_module+0x62/0x1e0
[<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
The 2MB mapping is then cleared, and finally the kernel oopses when a page in that space is accessed:
BUG: unable to handle kernel paging request at ffff880022600000
IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10
PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
Oops: 0002 [#1] SMP
Call Trace:
[<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0
[<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550
[<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140
[<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230
[<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0
[<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100
[<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350
[<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200
[<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10
[<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81139dab>] handle_mm_fault+0x15b/0x270
[<ffffffff81510c10>] do_page_fault+0x140/0x470
[<ffffffff8150d7d5>] page_fault+0x25/0x30
Call xen_cleanhighmap() with a 4MB aligned range for the page tables mapping to fix it. The unnecessary call of xen_cleanhighmap() in DEBUG mode is also removed.
-v2: add comment about XEN alignment from Juergen.
References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
[boris: added 'xen/mmu' tag to commit subject]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
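The mismatch between a 2MB and a 4MB boundary is plain alignment arithmetic, sketched below. `round_up_pow2()` and the `SZ_*_SKETCH` constants are hypothetical names for illustration; the addresses are made up, and the sketch does not reproduce how xen_cleanhighmap() computes its range.

```c
#include <stdint.h>

#define SZ_2M_SKETCH (UINT64_C(2) << 20)
#define SZ_4M_SKETCH (UINT64_C(4) << 20)

/* Round x up to the next multiple of align; align must be a power of two. */
uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}
```

For an end address of 0x400001, rounding up to 2MB gives 0x600000 while rounding up to 4MB gives 0x800000. A cleanup that stops at the 2MB boundary would therefore leave the range 0x600000 to 0x800000 mapped, which is the kind of leftover 2MB mapping the commit message describes.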
We currently route pte translation faults via do_page_fault, which elides the address check against TASK_SIZE before invoking the mm fault handling code. However, this can cause issues with the path walking code in conjunction with our word-at-a-time implementation because load_unaligned_zeropad can end up faulting in kernel space if it reads across a page boundary and runs into a page fault (e.g. by attempting to read from a guard region).
In the case of such a fault, load_unaligned_zeropad has registered a fixup to shift the valid data and pad with zeroes, however the abort is reported as a level 3 translation fault and we dispatch it straight to do_page_fault, despite it being a kernel address. This results in calling a sleeping function from atomic context:
BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
in_atomic(): 0, irqs_disabled(): 0, pid: 10290
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
[<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
[<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
[<ffffff8e016977f0>] do_page_fault+0x140/0x330
[<ffffff8e01681328>] do_mem_abort+0x54/0xb0
Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
[...]
[<ffffff8e016844fc>] el1_da+0x18/0x78
[<ffffff8e017f399c>] path_parentat+0x44/0x88
[<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
[<ffffff8e017f5044>] filename_create+0x4c/0x128
[<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
[<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
Code: 36380080 d5384100 f9400800 9402566d (d4210000)
---[ end trace 2d01889f2bca9b9f ]---
Fix this by dispatching all translation faults to do_translation_fault, which avoids invoking the page fault logic for faults on kernel addresses.
Cc: <stable@vger.kernel.org>
Reported-by: Ankit Jain <ankijain@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The following lockdep splat has been noticed during LTP testing
======================================================
WARNING: possible circular locking dependency detected
4.13.0-rc3-next-20170807 #12 Not tainted
------------------------------------------------------
a.out/4771 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff812b4668>] drain_all_stock.part.35+0x18/0x140
but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [<ffffffff8106eb35>] __do_page_fault+0x175/0x530
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&mm->mmap_sem){++++++}:
lock_acquire+0xc9/0x230
__might_fault+0x70/0xa0
_copy_to_user+0x23/0x70
filldir+0xa7/0x110
xfs_dir2_sf_getdents.isra.10+0x20c/0x2c0 [xfs]
xfs_readdir+0x1fa/0x2c0 [xfs]
xfs_file_readdir+0x30/0x40 [xfs]
iterate_dir+0x17a/0x1a0
SyS_getdents+0xb0/0x160
entry_SYSCALL_64_fastpath+0x1f/0xbe
-> #2 (&type->i_mutex_dir_key#3){++++++}:
lock_acquire+0xc9/0x230
down_read+0x51/0xb0
lookup_slow+0xde/0x210
walk_component+0x160/0x250
link_path_walk+0x1a6/0x610
path_openat+0xe4/0xd50
do_filp_open+0x91/0x100
file_open_name+0xf5/0x130
filp_open+0x33/0x50
kernel_read_file_from_path+0x39/0x80
_request_firmware+0x39f/0x880
request_firmware_direct+0x37/0x50
request_microcode_fw+0x64/0xe0
reload_store+0xf7/0x180
dev_attr_store+0x18/0x30
sysfs_kf_write+0x44/0x60
kernfs_fop_write+0x113/0x1a0
__vfs_write+0x37/0x170
vfs_write+0xc7/0x1c0
SyS_write+0x58/0xc0
do_syscall_64+0x6c/0x1f0
return_from_SYSCALL_64+0x0/0x7a
-> #1 (microcode_mutex){+.+.+.}:
lock_acquire+0xc9/0x230
__mutex_lock+0x88/0x960
mutex_lock_nested+0x1b/0x20
microcode_init+0xbb/0x208
do_one_initcall+0x51/0x1a9
kernel_init_freeable+0x208/0x2a7
kernel_init+0xe/0x104
ret_from_fork+0x2a/0x40
-> #0 (cpu_hotplug_lock.rw_sem){++++++}:
__lock_acquire+0x153c/0x1550
lock_acquire+0xc9/0x230
cpus_read_lock+0x4b/0x90
drain_all_stock.part.35+0x18/0x140
try_charge+0x3ab/0x6e0
mem_cgroup_try_charge+0x7f/0x2c0
shmem_getpage_gfp+0x25f/0x1050
shmem_fault+0x96/0x200
__do_fault+0x1e/0xa0
__handle_mm_fault+0x9c3/0xe00
handle_mm_fault+0x16e/0x380
__do_page_fault+0x24a/0x530
do_page_fault+0x30/0x80
page_fault+0x28/0x30
other info that might help us debug this:
Chain exists of:
cpu_hotplug_lock.rw_sem --> &type->i_mutex_dir_key#3 --> &mm->mmap_sem
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&mm->mmap_sem);
lock(&type->i_mutex_dir_key#3);
lock(&mm->mmap_sem);
lock(cpu_hotplug_lock.rw_sem);
*** DEADLOCK ***
2 locks held by a.out/4771:
#0: (&mm->mmap_sem){++++++}, at: [<ffffffff8106eb35>] __do_page_fault+0x175/0x530
#1: (percpu_charge_mutex){+.+...}, at: [<ffffffff812b4c97>] try_charge+0x397/0x6e0
The problem is very similar to the one fixed by commit a459eeb
("mm, page_alloc: do not depend on cpu hotplug locks inside the
allocator"). We are taking hotplug locks while we can be sitting on top
of basically arbitrary locks. This just calls for problems.
We can get rid of {get,put}_online_cpus, fortunately. We do not have to
be worried about races with memory hotplug because drain_local_stock,
which is called from both the WQ draining and the memory hotplug
contexts, is always operating on the local cpu stock with IRQs disabled.
The only thing to be careful about is that the target memcg doesn't
vanish while we are still in drain_all_stock so take a reference on it.
Link: http://lkml.kernel.org/r/20170913090023.28322-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Artem Savkov <asavkov@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
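The dependency chain in the splat above can be replayed with a small lock-order graph. This is a rough Python sketch of the idea behind lockdep's check, not lockdep itself: the `LockGraph` class and `replay_splat_chain` helper are hypothetical, and the "held while acquiring" pairs are read off the reverse-order chain (#1 to #3) plus the new acquisition (#0).

```python
from collections import defaultdict


class LockGraph:
    """Records 'held -> acquired' edges and reports ordering cycles."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded dependencies.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is already held.

        Returns True if the ordering is consistent, False if `held` is
        already reachable from `new`, i.e. the new edge closes a cycle.
        """
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


def replay_splat_chain():
    """Feed the chain from the report into the graph, oldest edge first."""
    g = LockGraph()
    return [
        g.acquire("cpu_hotplug_lock", "microcode_mutex"),  # 1: microcode_init
        g.acquire("microcode_mutex", "i_mutex_dir_key"),   # 2: reload_store
        g.acquire("i_mutex_dir_key", "mmap_sem"),          # 3: getdents
        g.acquire("mmap_sem", "cpu_hotplug_lock"),         # 0: drain_all_stock
    ]
```

Only the last acquisition is rejected, which is the one the patch removes by dropping `{get,put}_online_cpus` from `drain_all_stock`.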
printk_ratelimit() invokes ___ratelimit() which may invoke a normal
printk() (pr_warn() in this particular case) to warn about suppressed
output. Given that printk_ratelimit() may be called from anywhere, that
pr_warn() is dangerous - it may end up deadlocking the system. Fix
___ratelimit() by using deferred printk().
Sasha reported the following lockdep error:
: Unregister pv shared memory for cpu 8
: select_fallback_rq: 3 callbacks suppressed
: process 8583 (trinity-c78) no longer affine to cpu8
:
: ======================================================
: WARNING: possible circular locking dependency detected
: 4.14.0-rc2-next-20170927+ #252 Not tainted
: ------------------------------------------------------
: migration/8/62 is trying to acquire lock:
: (&port_lock_key){-.-.}, at: serial8250_console_write()
:
: but task is already holding lock:
: (&rq->lock){-.-.}, at: sched_cpu_dying()
:
: which lock already depends on the new lock.
:
:
: the existing dependency chain (in reverse order) is:
:
: -> #3 (&rq->lock){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock()
: task_fork_fair()
: sched_fork()
: copy_process.part.31()
: _do_fork()
: kernel_thread()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #2 (&p->pi_lock){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: try_to_wake_up()
: default_wake_function()
: woken_wake_function()
: __wake_up_common()
: __wake_up_common_lock()
: __wake_up()
: tty_wakeup()
: tty_port_default_wakeup()
: tty_port_tty_wakeup()
: uart_write_wakeup()
: serial8250_tx_chars()
: serial8250_handle_irq.part.25()
: serial8250_default_handle_irq()
: serial8250_interrupt()
: __handle_irq_event_percpu()
: handle_irq_event_percpu()
: handle_irq_event()
: handle_level_irq()
: handle_irq()
: do_IRQ()
: ret_from_intr()
: native_safe_halt()
: default_idle()
: arch_cpu_idle()
: default_idle_call()
: do_idle()
: cpu_startup_entry()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #1 (&tty->write_wait){-.-.}:
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: __wake_up_common_lock()
: __wake_up()
: tty_wakeup()
: tty_port_default_wakeup()
: tty_port_tty_wakeup()
: uart_write_wakeup()
: serial8250_tx_chars()
: serial8250_handle_irq.part.25()
: serial8250_default_handle_irq()
: serial8250_interrupt()
: __handle_irq_event_percpu()
: handle_irq_event_percpu()
: handle_irq_event()
: handle_level_irq()
: handle_irq()
: do_IRQ()
: ret_from_intr()
: native_safe_halt()
: default_idle()
: arch_cpu_idle()
: default_idle_call()
: do_idle()
: cpu_startup_entry()
: rest_init()
: start_kernel()
: x86_64_start_reservations()
: x86_64_start_kernel()
: verify_cpu()
:
: -> #0 (&port_lock_key){-.-.}:
: check_prev_add()
: __lock_acquire()
: lock_acquire()
: _raw_spin_lock_irqsave()
: serial8250_console_write()
: univ8250_console_write()
: console_unlock()
: vprintk_emit()
: vprintk_default()
: vprintk_func()
: printk()
: ___ratelimit()
: __printk_ratelimit()
: select_fallback_rq()
: sched_cpu_dying()
: cpuhp_invoke_callback()
: take_cpu_down()
: multi_cpu_stop()
: cpu_stopper_thread()
: smpboot_thread_fn()
: kthread()
: ret_from_fork()
:
: other info that might help us debug this:
:
: Chain exists of:
: &port_lock_key --> &p->pi_lock --> &rq->lock
:
: Possible unsafe locking scenario:
:
: CPU0 CPU1
: ---- ----
: lock(&rq->lock);
: lock(&p->pi_lock);
: lock(&rq->lock);
: lock(&port_lock_key);
:
: *** DEADLOCK ***
:
: 4 locks held by migration/8/62:
: #0: (&p->pi_lock){-.-.}, at: sched_cpu_dying()
: #1: (&rq->lock){-.-.}, at: sched_cpu_dying()
: #2: (printk_ratelimit_state.lock){....}, at: ___ratelimit()
: #3: (console_lock){+.+.}, at: vprintk_emit()
:
: stack backtrace:
: CPU: 8 PID: 62 Comm: migration/8 Not tainted 4.14.0-rc2-next-20170927+ #252
: Call Trace:
: dump_stack()
: print_circular_bug()
: check_prev_add()
: ? add_lock_to_list.isra.26()
: ? check_usage()
: ? kvm_clock_read()
: ? kvm_sched_clock_read()
: ? sched_clock()
: ? check_preemption_disabled()
: __lock_acquire()
: ? __lock_acquire()
: ? add_lock_to_list.isra.26()
: ? debug_check_no_locks_freed()
: ? memcpy()
: lock_acquire()
: ? serial8250_console_write()
: _raw_spin_lock_irqsave()
: ? serial8250_console_write()
: serial8250_console_write()
: ? serial8250_start_tx()
: ? lock_acquire()
: ? memcpy()
: univ8250_console_write()
: console_unlock()
: ? __down_trylock_console_sem()
: vprintk_emit()
: vprintk_default()
: vprintk_func()
: printk()
: ? show_regs_print_info()
: ? lock_acquire()
: ___ratelimit()
: __printk_ratelimit()
: select_fallback_rq()
: sched_cpu_dying()
: ? sched_cpu_starting()
: ? rcutree_dying_cpu()
: ? sched_cpu_starting()
: cpuhp_invoke_callback()
: ? cpu_disable_common()
: take_cpu_down()
: ? trace_hardirqs_off_caller()
: ? cpuhp_invoke_callback()
: multi_cpu_stop()
: ? __this_cpu_preempt_check()
: ? cpu_stop_queue_work()
: cpu_stopper_thread()
: ? cpu_stop_create()
: smpboot_thread_fn()
: ? sort_range()
: ? schedule()
: ? __kthread_parkme()
: kthread()
: ? sort_range()
: ? kthread_create_on_node()
: ret_from_fork()
: process 9121 (trinity-c78) no longer affine to cpu8
: smpboot: CPU 8 is now offline
Link: http://lkml.kernel.org/r/20170928120405.18273-1-sergey.senozhatsky@gmail.com
Fixes: 6b1d174 ("ratelimit: extend to print suppressed messages on release")
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Borislav Petkov <bp@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
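To make the fix concrete, here is a simplified Python model of a rate limiter whose "N callbacks suppressed" notice is queued rather than printed on the spot. It is a sketch under stated assumptions, not the kernel's `___ratelimit()`: the `RateLimit` class, its `emit_deferred` callback, and the integer timestamps are all hypothetical, and locking is left out entirely.

```python
class RateLimit:
    """Allow at most `burst` messages per `interval` time units."""

    def __init__(self, interval, burst, emit_deferred):
        self.interval = interval
        self.burst = burst
        # Callback that must only queue the text. It stands in for a
        # deferred printk: nothing is written to a console from here.
        self.emit_deferred = emit_deferred
        self.begin = None
        self.printed = 0
        self.missed = 0

    def allow(self, now, caller):
        if self.begin is None:
            self.begin = now
        if now - self.begin > self.interval:
            # The window expired: report what was dropped, then reset.
            if self.missed:
                self.emit_deferred(
                    f"{caller}: {self.missed} callbacks suppressed"
                )
                self.missed = 0
            self.begin = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False
```

A caller that holds a scheduler lock can pass `queue.append` as `emit_deferred` and flush the queue later from a context where taking the console and port locks is safe. That is the property the lockdep report shows to be missing when a plain `printk()` is used.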
flush_tlb_kernel_range() may call smp_call_function_many() which expects interrupts to be enabled. This results in a traceback. WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1 task: cf830000 task.stack: cf82e000 NIP: c00a93c8 LR: c00a9634 CTR: 00000001 REGS: cf82fde0 TRAP: 0700 Not tainted (4.14.0-rc1-00009-g0666f56) MSR: 00021000 <CE,ME> CR: 24000082 XER: 00000000 GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001 GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c051000 GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000 NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc LR [c00a9634] smp_call_function+0x3c/0x50 Call Trace: [cf82fe90] [00000010] 0x10 (unreliable) [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50 [cf82fee0] [c0015d2] flush_tlb_kernel_range+0x20/0x38 [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c [cf82ff20] [c001484c] free_initmem+0x20/0x4c [cf82ff30] [c000316c] kernel_init+0x1c/0x108 [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac 3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78 Fixes: 3184cc4 ("powerpc/mm: Fix kernel RAM protection after freeing ...") Fixes: e611939 ("powerpc/mm: Ensure change_page_attr() doesn't ...") Cc: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It works! 🎉 Device got automatically attached. Great work! FYI here is all related dmesg output:
I can confirm the patches work on the latest linux-next tree. Thanks for your work! PS @l1k if you don't mind, could you change my cc email in the commit to peteryuchuang at gmail dot com? Thanks.
Thanks a lot for testing! I'll have to do a bit more code reading and analysis of @roadrunner2's dmesg output to finalize the patch. The issue that a timeout occurs, does this still exist? With serdev, you can no longer stop and restart hciattach to work around it... @roadrunner2: About the proto-locks patch: I've realized only now that I can't really connect the proto-locks patch with the stacktraces you've posted above: They refer to a So, looking at the second stacktrace, we can see that @peterychuang: Sure, I've updated your e-mail address and pushed the branch, again rebased on current bluetooth-next.
@l1k Regarding the locking: the mutex is in Regarding Now, AFAICT we can't replace the mutex in I looked at So for bluetooth-next it appears the lock patch isn't strictly necessary, but for 4.14? Lastly, regarding the timeout, no, haven't seen it yet, but haven't done much testing - will bang at it some more and let you know.
commit c95072b upstream. While line6_probe() may kick off URB for a control MIDI endpoint, the function doesn't clean it up properly in its error path. This results in a leftover URB action that is eventually triggered later and causes an Oops like: general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 0 Comm: swapper/1 Not tainted RIP: 0010:usb_fill_bulk_urb ./include/linux/usb.h:1619 RIP: 0010:line6_start_listen+0x3fe/0x9e0 sound/usb/line6/driver.c:76 Call Trace: <IRQ> line6_data_received+0x1f7/0x470 sound/usb/line6/driver.c:326 __usb_hcd_giveback_urb+0x2e0/0x650 drivers/usb/core/hcd.c:1779 usb_hcd_giveback_urb+0x337/0x420 drivers/usb/core/hcd.c:1845 dummy_timer+0xba9/0x39f0 drivers/usb/gadget/udc/dummy_hcd.c:1965 call_timer_fn+0x2a2/0x940 kernel/time/timer.c:1281 .... Since the whole clean-up procedure is done in line6_disconnect() callback, we can simply call it in the error path instead of open-coding the whole thing again. It'll fix such an issue automagically. The bug was spotted by syzkaller. Fixes: eedd0e9 ("ALSA: line6: Don't forget to call driver's destructor at error path") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…/git/vgupta/arc Pull ARC updates from Vineet Gupta: - Support for HSDK board hosting a Quad core HS38x4 based SoC running @1GHZ (and some prerequisite changes such as ability to scoot the kernel code/data from start of memory map etc) - Quite a few updates for EZChip (Mellanox) platform - Fixes to fault/exception printing * tag 'arc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (26 commits) ARC: Re-enable MMU upon Machine Check exception ARC: Show fault information passed to show_kernel_fault_diag() ARC: [plat-hsdk] initial port for HSDK board ARC: mm: Decouple RAM base address from kernel link address ARCv2: IOC: Tighten up the contraints (specifically base / size alignment) ARC: [plat-axs103] refactor the DT fudging code ARC: [plat-axs103] use clk driver #2: Add core pll node to DT to manage cpu clk ARC: [plat-axs103] use clk driver #1: Get rid of platform specific cpu clk setting ARCv2: SLC: provide a line based flush routine for debugging ARC: Hardcode ARCH_DMA_MINALIGN to max line length we may have ARC: [plat-eznps] handle extra aux regs #2: kernel/entry exit ARC: [plat-eznps] handle extra aux regs #1: save/restore on context switch ARC: [plat-eznps] avoid toggling of DPC register ARC: [plat-eznps] Update the init sequence of aux regs per cpu. ARC: [plat-eznps] new command line argument for HW scheduler at MTM ARC: set boot print log level to PR_INFO ARC: [plat-eznps] Handle user memory error same in simulation and silicon ARC: [plat-eznps] use schd.wft instruction instead of sleep at idle task ARC: create cpu specific version of arch_cpu_idle() ARC: [plat-eznps] spinlock aware for MTM ...
For the same reasons we already cache the leftmost pointer, apply the same optimization for rb_last() calls. Users must explicitly do this as rb_root_cached only deals with the smallest node. [dave@stgolabs.net: brain fart #1] Link: http://lkml.kernel.org/r/20170731155955.GD21328@linux-80c1.suse Link: http://lkml.kernel.org/r/20170719014603.19029-18-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
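The caching idea in this commit message can be shown with a toy tree. The sketch below uses a plain unbalanced binary search tree in Python rather than the kernel's red-black tree, and the `CachedTree` class and its method names are hypothetical. It illustrates only the bookkeeping the user has to do on insert: a new node becomes the cached rightmost node exactly when the descent never turned left.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class CachedTree:
    """Unbalanced binary search tree that caches its rightmost node."""

    def __init__(self):
        self.root = None
        self.rightmost = None  # cached answer for last()

    def insert(self, key):
        node = Node(key)
        if self.root is None:
            self.root = self.rightmost = node
            return
        cur = self.root
        went_only_right = True
        while True:
            if key < cur.key:
                went_only_right = False
                if cur.left is None:
                    cur.left = node
                    break
                cur = cur.left
            else:
                if cur.right is None:
                    cur.right = node
                    break
                cur = cur.right
        # Only a descent that never went left can produce a new maximum.
        if went_only_right:
            self.rightmost = node

    def last(self):
        """O(1) lookup of the largest key via the cached pointer."""
        return self.rightmost.key if self.rightmost else None

    def last_by_walk(self):
        """Reference implementation: walk right links from the root."""
        cur = self.root
        while cur and cur.right:
            cur = cur.right
        return cur.key if cur else None
```

Erasing nodes is not modelled here. In a real user the cached pointer would also have to be moved to the predecessor whenever the rightmost node is removed.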
do_xdp_generic must be called inside rcu critical section with preempt
disabled to ensure BPF programs are valid and per-cpu variables used
for redirect operations are consistent. This patch ensures this is true
and fixes the splat below.
The netif_receive_skb_internal() code path is now broken into two rcu
critical sections. I decided it was better to limit the preempt_enable/disable
block to just the xdp static key portion and the fallout is more
rcu_read_lock/unlock calls. Seems like the best option to me.
[ 607.596901] =============================
[ 607.596906] WARNING: suspicious RCU usage
[ 607.596912] 4.13.0-rc4+ #570 Not tainted
[ 607.596917] -----------------------------
[ 607.596923] net/core/dev.c:3948 suspicious rcu_dereference_check() usage!
[ 607.596927]
[ 607.596927] other info that might help us debug this:
[ 607.596927]
[ 607.596933]
[ 607.596933] rcu_scheduler_active = 2, debug_locks = 1
[ 607.596938] 2 locks held by pool/14624:
[ 607.596943] #0: (rcu_read_lock_bh){......}, at: [<ffffffff95445ffd>] ip_finish_output2+0x14d/0x890
[ 607.596973] #1: (rcu_read_lock_bh){......}, at: [<ffffffff953c8e3a>] __dev_queue_xmit+0x14a/0xfd0
[ 607.597000]
[ 607.597000] stack backtrace:
[ 607.597006] CPU: 5 PID: 14624 Comm: pool Not tainted 4.13.0-rc4+ #570
[ 607.597011] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A17 03/01/2017
[ 607.597016] Call Trace:
[ 607.597027] dump_stack+0x67/0x92
[ 607.597040] lockdep_rcu_suspicious+0xdd/0x110
[ 607.597054] do_xdp_generic+0x313/0xa50
[ 607.597068] ? time_hardirqs_on+0x5b/0x150
[ 607.597076] ? mark_held_locks+0x6b/0xc0
[ 607.597088] ? netdev_pick_tx+0x150/0x150
[ 607.597117] netif_rx_internal+0x205/0x3f0
[ 607.597127] ? do_xdp_generic+0xa50/0xa50
[ 607.597144] ? lock_downgrade+0x2b0/0x2b0
[ 607.597158] ? __lock_is_held+0x93/0x100
[ 607.597187] netif_rx+0x119/0x190
[ 607.597202] loopback_xmit+0xfd/0x1b0
[ 607.597214] dev_hard_start_xmit+0x127/0x4e0
Fixes: d445516 ("net: xdp: support xdp generic on virtual devices")
Fixes: b5cdae3 ("net: Generic XDP")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
__get_request() can call trace_block_getrq() with bio=NULL which causes block_get_rq::TP_fast_assign() to deref a NULL pointer and panic. Syzkaller fuzzer panics with linux-next (1d53d90): kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Modules linked in: CPU: 0 PID: 2983 Comm: syzkaller401111 Not tainted 4.13.0-rc7-next-20170901+ #13 task: ffff8801cf1da000 task.stack: ffff8801ce440000 RIP: 0010:perf_trace_block_get_rq+0x697/0x970 include/trace/events/block.h:384 RSP: 0018:ffff8801ce4473f0 EFLAGS: 00010246 RAX: ffff8801cf1da000 RBX: 1ffff10039c88e84 RCX: 1ffffd1ffff84d27 RDX: dffffc0000000001 RSI: 1ffff1003b643e7a RDI: ffffe8ffffc26938 RBP: ffff8801ce447530 R08: 1ffff1003b643e6c R09: ffffe8ffffc26964 R10: 0000000000000002 R11: fffff91ffff84d2d R12: ffffe8ffffc1f890 R13: ffffe8ffffc26930 R14: ffffffff85cad9e0 R15: 0000000000000000 FS: 0000000002641880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000043e670 CR3: 00000001d1d7a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: trace_block_getrq include/trace/events/block.h:423 [inline] __get_request block/blk-core.c:1283 [inline] get_request+0x1518/0x23b0 block/blk-core.c:1355 blk_old_get_request block/blk-core.c:1402 [inline] blk_get_request+0x1d8/0x3c0 block/blk-core.c:1427 sg_scsi_ioctl+0x117/0x750 block/scsi_ioctl.c:451 sg_ioctl+0x192d/0x2ed0 drivers/scsi/sg.c:1070 vfs_ioctl fs/ioctl.c:45 [inline] do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685 SYSC_ioctl fs/ioctl.c:700 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 entry_SYSCALL_64_fastpath+0x1f/0xbe block_get_rq::TP_fast_assign() has multiple redundant ->dev assignments. Only one of them is NULL tolerant. Favor the NULL tolerant one. 
Fixes: 74d4699 ("block: replace bi_bdev with a gendisk pointer and partitions index") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
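As an illustration of "favor the NULL tolerant one", the following Python sketch models a trace-record builder that accepts a missing bio. The `Bio` dataclass, its field names, and `trace_get_rq` are hypothetical stand-ins for the tracepoint's fast-assign step, not the kernel definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bio:
    dev: int
    sector: int
    nr_sectors: int


def trace_get_rq(bio: Optional[Bio]) -> dict:
    """Build a trace record, tolerating bio=None on every field.

    Each field is assigned exactly once, and each assignment falls back
    to 0 when there is no bio, so no code path dereferences None.
    """
    return {
        "dev": bio.dev if bio is not None else 0,
        "sector": bio.sector if bio is not None else 0,
        "nr_sector": bio.nr_sectors if bio is not None else 0,
    }
```

The bug described above corresponds to a second, unconditional `bio.dev` access placed after the guarded one: redundant when a bio exists and fatal when it does not.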
Aneesh Kumar reported seeing host crashes when running recent kernels on POWER8. The symptom was an oops like this: Unable to handle kernel paging request for data at address 0xf00000000786c620 Faulting instruction address: 0xc00000000030e1e4 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: powernv_op_panel CPU: 24 PID: 6663 Comm: qemu-system-ppc Tainted: G W 4.13.0-rc7-43932-gfc36c59 #2 task: c000000fdeadfe80 task.stack: c000000fdeb68000 NIP: c00000000030e1e4 LR: c00000000030de6c CTR: c000000000103620 REGS: c000000fdeb6b450 TRAP: 0300 Tainted: G W (4.13.0-rc7-43932-gfc36c59) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24044428 XER: 20000000 CFAR: c00000000030e134 DAR: f00000000786c620 DSISR: 40000000 SOFTE: 0 GPR00: 0000000000000000 c000000fdeb6b6d0 c0000000010bd000 000000000000e1b0 GPR04: c00000000115e168 c000001fffa6e4b0 c00000000115d000 c000001e1b180386 GPR08: f000000000000000 c000000f9a8913e0 f00000000786c600 00007fff587d0000 GPR12: c000000fdeb68000 c00000000fb0f000 0000000000000001 00007fff587cffff GPR16: 0000000000000000 c000000000000000 00000000003fffff c000000fdebfe1f8 GPR20: 0000000000000004 c000000fdeb6b8a8 0000000000000001 0008000000000040 GPR24: 07000000000000c0 00007fff587cffff c000000fdec20bf8 00007fff587d0000 GPR28: c000000fdeca9ac0 00007fff587d0000 00007fff587c0000 00007fff587d0000 NIP [c00000000030e1e4] __get_user_pages_fast+0x434/0x1070 LR [c00000000030de6c] __get_user_pages_fast+0xbc/0x1070 Call Trace: [c000000fdeb6b6d0] [c00000000139dab8] lock_classes+0x0/0x35fe50 (unreliable) [c000000fdeb6b7e0] [c00000000030ef38] get_user_pages_fast+0xf8/0x120 [c000000fdeb6b830] [c000000000112318] kvmppc_book3s_hv_page_fault+0x308/0xf30 [c000000fdeb6b960] [c00000000010e10c] kvmppc_vcpu_run_hv+0xfdc/0x1f00 [c000000fdeb6bb20] [c0000000000e915c] kvmppc_vcpu_run+0x2c/0x40 [c000000fdeb6bb40] [c0000000000e5650] kvm_arch_vcpu_ioctl_run+0x110/0x300 [c000000fdeb6bbe0] [c0000000000d6468] 
kvm_vcpu_ioctl+0x528/0x900 [c000000fdeb6bd40] [c0000000003bc04c] do_vfs_ioctl+0xcc/0x950 [c000000fdeb6bde0] [c0000000003bc930] SyS_ioctl+0x60/0x100 [c000000fdeb6be30] [c00000000000b96c] system_call+0x58/0x6c Instruction dump: 7ca81a14 2fa50000 41de0010 7cc8182a 68c60002 78c6ffe2 0b060000 3cc2000a 794a3664 390610d8 e9080000 7d485214 <e90a0020> 7d435378 790507e1 408202f0 ---[ end trace fad4a342d0414aa2 ]--- It turns out that what has happened is that the SLB entry for the vmemmap region hasn't been reloaded on exit from a guest, and it has the wrong page size. Then, when the host next accesses the vmemmap region, it gets a page fault. Commit a25bd72 ("powerpc/mm/radix: Workaround prefetch issue with KVM", 2017-07-24) modified the guest exit code so that it now only clears out the SLB for hash guests. The code tests the radix flag and puts the result in a non-volatile CR field, CR2, and later branches based on CR2. Unfortunately, the kvmppc_save_tm function, which gets called between those two points, modifies all the user-visible registers in the case where the guest was in transactional or suspended state, except for a few which it restores (namely r1, r2, r9 and r13). Thus the hash/radix indication in CR2 gets corrupted. This fixes the problem by re-doing the comparison just before the result is needed. For good measure, this also adds comments next to the call sites of kvmppc_save_tm and kvmppc_restore_tm pointing out that non-volatile register state will be lost. Cc: stable@vger.kernel.org # v4.13 Fixes: a25bd72 ("powerpc/mm/radix: Workaround prefetch issue with KVM") Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Since commit dc749a0 ("gpiolib: allow gpio irqchip to map irqs dynamically"), the irqs for gpio are no longer statically allocated in gpiochip_irqchip_add. This driver relied on static allocation to initialize the mask associated with each interrupt, which led to a NULL pointer crash in the kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000 Mem abort info: Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000068 CM = 0, WnR = 1 [0000000000000000] user address but active_mm is swapper Internal error: Oops: 96000044 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-06657-g3b9f8ed25dbe #576 Hardware name: Marvell Armada 3720 Development Board DB-88F3720-DDR3 (DT) task: ffff80001d908000 task.stack: ffff000008068000 PC is at armada_37xx_pinctrl_probe+0x5f8/0x670 LR is at armada_37xx_pinctrl_probe+0x5e8/0x670 pc : [<ffff000008e25cdc>] lr : [<ffff000008e25ccc>] pstate: 60000045 sp : ffff00000806bb80 x29: ffff00000806bb80 x28: 0000000000000024 x27: 000000000000000c x26: 0000000000000001 x25: ffff80001efee760 x24: 0000000000000000 x23: ffff80001db6f570 x22: ffff80001db6f438 x21: 0000000000000000 x20: ffff80001d9f4810 x19: ffff80001db6f418 x18: 0000000000000000 x17: 0000000000000001 x16: 0000000000000019 x15: ffffffffffffffff x14: 0140000000000000 x13: 0000000000000000 x12: 0000000000000030 x11: 0101010101010101 x10: 0000000000000040 x9 : ffff000009923580 x8 : ffff80001d400248 x7 : ffff80001d400270 x6 : 0000000000000000 x5 : ffff80001d400248 x4 : ffff80001d400270 x3 : 0000000000000000 x2 : 0000000000000001 x1 : 0000000000000001 x0 : 0000000000000000 Process swapper/0 (pid: 1, stack limit = 0xffff000008068000) Call trace: Exception stack(0xffff00000806ba40 to 0xffff00000806bb80) ba40: 0000000000000000 0000000000000001 0000000000000001 0000000000000000 ba60: ffff80001d400270 ffff80001d400248 0000000000000000 ffff80001d400270 ba80: 
ffff80001d400248 ffff000009923580 0000000000000040 0101010101010101 baa0: 0000000000000030 0000000000000000 0140000000000000 ffffffffffffffff bac0: 0000000000000019 0000000000000001 0000000000000000 ffff80001db6f418 bae0: ffff80001d9f4810 0000000000000000 ffff80001db6f438 ffff80001db6f570 bb00: 0000000000000000 ffff80001efee760 0000000000000001 000000000000000c bb20: 0000000000000024 ffff00000806bb80 ffff000008e25ccc ffff00000806bb80 bb40: ffff000008e25cdc 0000000060000045 ffff00000806bb60 ffff0000081189b8 bb60: ffffffffffffffff ffff00000811cf1c ffff00000806bb80 ffff000008e25cdc [<ffff000008e25cdc>] armada_37xx_pinctrl_probe+0x5f8/0x670 [<ffff00000859d8c8>] platform_drv_probe+0x58/0xb8 [<ffff00000859bb44>] driver_probe_device+0x22c/0x2d8 [<ffff00000859bcac>] __driver_attach+0xbc/0xc0 [<ffff000008599c84>] bus_for_each_dev+0x4c/0x98 [<ffff00000859b440>] driver_attach+0x20/0x28 [<ffff00000859af90>] bus_add_driver+0x1b8/0x228 [<ffff00000859c648>] driver_register+0x60/0xf8 [<ffff00000859df64>] __platform_driver_probe+0x74/0x130 [<ffff000008e256dc>] armada_37xx_pinctrl_driver_init+0x20/0x28 [<ffff000008083980>] do_one_initcall+0x38/0x128 [<ffff000008e00cf4>] kernel_init_freeable+0x188/0x22c [<ffff0000089b56e8>] kernel_init+0x10/0x100 [<ffff000008084bb0>] ret_from_fork+0x10/0x18 Code: f9403fa2 12001341 1100075a 9ac12041 (b9000001) ---[ end trace 8b0f4e05e1603208 ]--- This patch moves the initialization of the mask field to the irq_startup function. However, some callbacks such as irq_set_type and irq_set_wake could be called before irq_startup. For those functions the mask is computed at each call, which is not an issue as these functions are not located in a hot path but are used sporadically for configuration. Fixes: dc749a0 ("gpiolib: allow gpio irqchip to map irqs dynamically") Cc: <stable@vger.kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
…dline If ipv6 has been disabled from cmdline since the kernel started, it makes no sense to allow users to create any ip6 tunnel. Otherwise, it could cause potential problems. Jianlin found a kernel crash caused by this in ip6_gre when he set ipv6.disable=1 in grub: [ 209.588865] Unable to handle kernel paging request for data at address 0x00000080 [ 209.588872] Faulting instruction address: 0xc000000000a3aa6c [ 209.588879] Oops: Kernel access of bad area, sig: 11 [#1] [ 209.589062] NIP [c000000000a3aa6c] fib_rules_lookup+0x4c/0x260 [ 209.589071] LR [c000000000b9ad90] fib6_rule_lookup+0x50/0xb0 [ 209.589076] Call Trace: [ 209.589097] fib6_rule_lookup+0x50/0xb0 [ 209.589106] rt6_lookup+0xc4/0x110 [ 209.589116] ip6gre_tnl_link_config+0x214/0x2f0 [ip6_gre] [ 209.589125] ip6gre_newlink+0x138/0x3a0 [ip6_gre] [ 209.589134] rtnl_newlink+0x798/0xb80 [ 209.589142] rtnetlink_rcv_msg+0xec/0x390 [ 209.589151] netlink_rcv_skb+0x138/0x150 [ 209.589159] rtnetlink_rcv+0x48/0x70 [ 209.589169] netlink_unicast+0x538/0x640 [ 209.589175] netlink_sendmsg+0x40c/0x480 [ 209.589184] ___sys_sendmsg+0x384/0x4e0 [ 209.589194] SyS_sendmsg+0xd4/0x140 [ 209.589201] SyS_socketcall+0x3e0/0x4f0 [ 209.589209] system_call+0x38/0xe0 This patch is to return -EOPNOTSUPP in ip6_tunnel_init if ipv6 has been disabled from cmdline. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The "Release:" field of the spec file is determined based on the .version file. However, the .version file is not copied to the source tar file. So, when we build the kernel from the source package, the UTS_VERSION always indicates #1. This does not match with "rpm -q". The kernel UTS_VERSION and "rpm -q" do not agree for binrpm-pkg, either. Please note the kernel has already been built before the spec file is created. Currently, mkspec invokes mkversion. This script returns an incremented version. So, the "Release:" field of the spec file is greater than the version in the kernel by one. For the source package build (where .version file is missing), we can give KBUILD_BUILD_VERSION=%{release} to the build command. For the binary package build, we can simply read out the .version file because it contains the version number that was used for building the kernel image. We can remove scripts/mkversion because scripts/package/Makefile need not touch the .version file. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
…LPAR Commit 215ee76 ("powerpc: pseries: remove dlpar_attach_node dependency on full path") reworked dlpar_attach_node() to no longer look up the parent node "/cpus", but instead to have the parent node passed by the caller in the function parameter list. As a result dlpar_attach_node() is no longer responsible for freeing the reference to the parent node. However, commit 215ee76 failed to remove the of_node_put(parent) call in dlpar_attach_node(), or to take into account that the reference to the parent in the caller dlpar_cpu_add() needs to be held until after dlpar_attach_node() returns. As a result doing repeated cpu add/remove dlpar operations will eventually result in the following error: OF: ERROR: Bad of_node_put() on /cpus CPU: 0 PID: 10896 Comm: drmgr Not tainted 4.13.0-autotest #1 Call Trace: dump_stack+0x15c/0x1f8 (unreliable) of_node_release+0x1a4/0x1c0 kobject_put+0x1a8/0x310 kobject_del+0xbc/0xf0 __of_detach_node_sysfs+0x144/0x210 of_detach_node+0xf0/0x180 dlpar_detach_node+0xc4/0x120 dlpar_cpu_remove+0x280/0x560 dlpar_cpu_release+0xbc/0x1b0 arch_cpu_release+0x6c/0xb0 cpu_release_store+0xa0/0x100 dev_attr_store+0x68/0xa0 sysfs_kf_write+0xa8/0xf0 kernfs_fop_write+0x2cc/0x400 __vfs_write+0x5c/0x340 vfs_write+0x1a8/0x3d0 SyS_write+0xa8/0x1a0 system_call+0x58/0x6c Fix the issue by removing the of_node_put(parent) call from dlpar_attach_node(), and ensuring that the reference to the parent node is properly held and released by the caller dlpar_cpu_add(). Fixes: 215ee76 ("powerpc: pseries: remove dlpar_attach_node dependency on full path") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> [mpe: Add a comment in the code and frob the change log slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
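The reference-count imbalance described above can be reproduced with a toy counter. This Python sketch is only a model under stated assumptions: `RefNode`, `attach_buggy`, `attach_fixed`, `cpu_add_buggy`, and `cpu_add_fixed` are hypothetical names, and an exception on an over-release stands in for the "Bad of_node_put()" report.

```python
class RefNode:
    """Toy refcounted node; starts with one long-lived reference."""

    def __init__(self, name):
        self.name = name
        self.refs = 1

    def get(self):
        self.refs += 1
        return self

    def put(self):
        if self.refs <= 0:
            raise RuntimeError(f"Bad put on {self.name}")
        self.refs -= 1


def attach_buggy(parent):
    # Leftover from the old design: drops a reference that this
    # function no longer takes itself.
    parent.put()


def attach_fixed(parent):
    # The parent is passed in by the caller, so the caller owns the
    # reference and this function leaves the count alone.
    pass


def cpu_add_buggy(parent):
    parent.get()          # caller looks up the parent node
    parent.put()          # ...and releases it before attaching
    attach_buggy(parent)  # callee releases it a second time


def cpu_add_fixed(parent):
    parent.get()          # caller looks up the parent node
    attach_fixed(parent)
    parent.put()          # released only after attach has returned
```

In this model the buggy pair loses one reference per add operation, so the count reaches zero after the first add and the next one trips the check. The fixed pair leaves the count unchanged no matter how often it runs.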
Build with the latest patches resulted in panic:
[11384.486289] BUG: unable to handle kernel NULL pointer dereference at
(null)
[11384.486293] IP: (null)
[11384.486295] PGD 0
[11384.486295] P4D 0
[11384.486296]
[11384.486299] Oops: 0010 [#1] SMP
......... snip ......
[11384.486401] CPU: 0 PID: 968 Comm: kworker/0:1H Tainted: G W O
4.13.0-a-stream-20170825 #1
[11384.486402] Hardware name: Intel Corporation S2600WT2R/S2600WT2R,
BIOS SE5C610.86B.01.01.0014.121820151719 12/18/2015
[11384.486418] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[11384.486419] task: ffff880850579680 task.stack: ffffc90007fec000
[11384.486420] RIP: 0010: (null)
[11384.486420] RSP: 0018:ffffc90007fef970 EFLAGS: 00010206
[11384.486421] RAX: ffff88084cfe8000 RBX: ffff88084dce4000 RCX:
ffffc90007fef978
[11384.486422] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
ffff88084cfe8000
[11384.486422] RBP: ffffc90007fefab0 R08: 0000000000000000 R09:
ffff88084dce4080
[11384.486423] R10: ffffffffa02d7f60 R11: 0000000000000000 R12:
ffff88105af65a00
[11384.486423] R13: ffff88084dce4000 R14: 000000000000c000 R15:
000000000000c000
[11384.486424] FS: 0000000000000000(0000) GS:ffff88085f400000(0000)
knlGS:0000000000000000
[11384.486425] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11384.486425] CR2: 0000000000000000 CR3: 0000000001c09000 CR4:
00000000001406f0
[11384.486426] Call Trace:
[11384.486431] ? is_valid_mcast_lid.isra.21+0xfb/0x110 [ib_core]
[11384.486436] ib_attach_mcast+0x6f/0xa0 [ib_core]
[11384.486441] ipoib_mcast_attach+0x81/0x190 [ib_ipoib]
[11384.486443] ipoib_mcast_join_complete+0x354/0xb40 [ib_ipoib]
[11384.486448] mcast_work_handler+0x330/0x6c0 [ib_core]
[11384.486452] join_handler+0x101/0x220 [ib_core]
[11384.486455] ib_sa_mcmember_rec_callback+0x54/0x80 [ib_core]
[11384.486459] recv_handler+0x3a/0x60 [ib_core]
[11384.486462] ib_mad_recv_done+0x423/0x9b0 [ib_core]
[11384.486466] __ib_process_cq+0x5d/0xb0 [ib_core]
[11384.486469] ib_cq_poll_work+0x20/0x60 [ib_core]
[11384.486472] process_one_work+0x149/0x360
[11384.486474] worker_thread+0x4d/0x3c0
[11384.486487] kthread+0x109/0x140
[11384.486488] ? rescuer_thread+0x380/0x380
[11384.486489] ? kthread_park+0x60/0x60
[11384.486490] ? kthread_park+0x60/0x60
[11384.486493] ret_from_fork+0x25/0x30
[11384.486493] Code: Bad RIP value.
[11384.486493] Code: Bad RIP value.
[11384.486496] RIP: (null) RSP: ffffc90007fef970
[11384.486497] CR2: 0000000000000000
[11384.486531] ---[ end trace b1acec6fb4ff6e75 ]---
[11384.532133] Kernel panic - not syncing: Fatal exception
[11384.536541] Kernel Offset: disabled
[11384.969491] ---[ end Kernel panic - not syncing: Fatal exception
[11384.976875] sched: Unexpected reschedule of offline CPU#1!
[11384.983646] ------------[ cut here ]------------
An RDMA device driver may not have implemented (*get_link_layer)(),
so it cannot be called directly. Use the appropriate helper function instead.
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Fixes: 5236333 ("IB/core: Fix the validations of a multicast LID in attach or detach operations")
Cc: stable@kernel.org # 4.13
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Netdev event notifier registration/de-registration is not synchronized with a lock and there is a possibility of a duplicate registration of notifier before the unregister completes. Register netdev event notifiers during module init and de-register them at module exit. This avoids the need to tie the registration to first netdev client interface open and de-registration to last client interface close and the synchronization to achieve it. This also fixes a crash due to duplicate registration. BUG: unable to handle kernel paging request at ffffffffa0d60388 IP: [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70 PGD 190d067 PUD 190e063 PMD 76c840067 PTE 0 Oops: 0000 [#1] SMP Modules linked in: i40e(OF-) fuse btrfs zlib_deflate raid6_pq xor vfat msdos [..] e1000e vxlan ip_tunnel ptp pps_core i2c_core video [last unloaded: i40iw] CPU: 1 PID: 27101 Comm: modprobe Tainted: GF W O-------------- 3.10.0-229.el7.x86_64 #1 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q87M-D2H, BIOS F7 01/17/2014 task: ffff88076e8a96c0 ti: ffff8806959c8000 task.ti: ffff8806959c8000 RIP: 0010:[<ffffffff8160f75d>] [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70 RSP: 0018:ffff8806959cbb38 EFLAGS: 00010282 RAX: ffffffffa0d60380 RBX: 00000000fffffffd RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88081227a000 RDI: 0000000000000002 RBP: ffff8806959cbb60 R08: 0000000000000246 R09: 000000000000700c R10: ffff88080e16ea40 R11: 00000000000ae8df R12: ffffffffa0d60380 R13: 0000000000000002 R14: ffff88076e738800 R15: 0000000000000000 FS: 00007f604ef4a740(0000) GS:ffff88083e240000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0d60388 CR3: 0000000753cd2000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffff819e73a0 0000000000000000 0000000000000002 ffff88076e738800 00000000ffffffff ffff8806959cbba0 
ffffffff8109d61d 0000000000000000 0000000000000000 ffff88076e738800 0000000000000000 ffff88076e738800 Call Trace: [<ffffffff8109d61d>] __blocking_notifier_call_chain+0x4d/0x70 [<ffffffff8109d656>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff8156b9e4>] __inet_del_ifa+0x154/0x2b0 [<ffffffff8156d102>] inetdev_event+0x182/0x530 [<ffffffff8160f76c>] notifier_call_chain+0x4c/0x70 [<ffffffff8109d446>] raw_notifier_call_chain+0x16/0x20 [<ffffffff814f71fd>] call_netdevice_notifiers+0x2d/0x60 [<ffffffff814f8845>] rollback_registered_many+0x105/0x220 [<ffffffff814f89a0>] rollback_registered+0x40/0x70 [<ffffffff814f9c88>] unregister_netdevice_queue+0x48/0x80 [<ffffffff814f9cdc>] unregister_netdev+0x1c/0x30 [<ffffffffa0067139>] i40e_vsi_release+0x2a9/0x2b0 [i40e] [<ffffffffa00674e8>] i40e_remove+0x128/0x2b0 [i40e] [<ffffffff813092db>] pci_device_remove+0x3b/0xb0 [<ffffffff813d26ef>] __device_release_driver+0x7f/0xf0 [<ffffffff813d3068>] driver_detach+0xb8/0xc0 [<ffffffff813d22db>] bus_remove_driver+0x9b/0x120 [<ffffffff813d36dc>] driver_unregister+0x2c/0x50 [<ffffffff81307d4c>] pci_unregister_driver+0x2c/0x90 [<ffffffffa008f9d0>] i40e_exit_module+0x10/0x23 [i40e] [<ffffffff810dad0b>] SyS_delete_module+0x16b/0x2d0 [<ffffffff81013b0c>] ? do_notify_resume+0x9c/0xb0 [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b Code: e5 41 57 4d 89 c7 41 56 49 89 d6 41 55 49 89 f5 41 54 53 89 cb 75 14 eb 3d 0f 1f 44 00 00 83 eb 01 74 25 4d 85 e4 74 20 4c 89 e0 <4c> 8b 60 08 4c 89 f2 4c 89 ee 48 89 c7 ff 10 4d 85 ff 74 04 41 RIP [<ffffffff8160f75d>] notifier_call_chain+0x3d/0x70 Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
commit ab31fd0 upstream. v4.10 commit 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN recovery") extended accessing parent pointer fields of struct zfcp_erp_action for tracing. If an erp_action has never been enqueued before, these parent pointer fields are uninitialized and NULL. Examples are zfcp objects freshly added to the parent object's children list, before enqueueing their first recovery subsequently. In zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action fields can cause a NULL pointer dereference. Since the kernel can read from lowcore on s390, it does not immediately cause a kernel page fault. Instead it can cause hangs on trying to acquire the wrong erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl() ^bogus^ while holding already other locks with IRQs disabled. Real life example from attaching lots of LUNs in parallel on many CPUs: crash> bt 17723 PID: 17723 TASK: ... CPU: 25 COMMAND: "zfcperp0.0.1800" LOWCORE INFO: -psw : 0x0404300180000000 0x000000000038e424 -function : _raw_spin_lock_wait_flags at 38e424 ... #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp] #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp] #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp] #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp] #4 [fdde8fe60] kthread at 173550 #5 [fdde8feb8] kernel_thread_starter at 10add2 zfcp_adapter zfcp_port zfcp_unit <address>, 0x404040d600000000 scsi_device NULL, returning early! zfcp_scsi_dev.status = 0x40000000 0x40000000 ZFCP_STATUS_COMMON_RUNNING crash> zfcp_unit <address> struct zfcp_unit { erp_action = { adapter = 0x0, port = 0x0, unit = 0x0, }, } zfcp_erp_action is always fully embedded into its container object. Such container object is never moved in its object tree (only add or delete). Hence, erp_action parent pointers can never change. 
To fix the issue, initialize the erp_action parent pointers before adding the erp_action container to any list and thus before it becomes accessible from outside of its initializing function. In order to also close the time window between zfcp_erp_setup_act() memsetting the entire erp_action to zero and setting the parent pointers again, drop the memset and instead explicitly initialize individually all erp_action fields except for parent pointers. To be extra careful not to introduce any other unintended side effect, even keep zeroing the erp_action fields for list and timer. Also double-check with WARN_ON_ONCE that erp_action parent pointers never change, so we get to know when we would deviate from previous behavior. Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Fixes: 6f2ce1c ("scsi: zfcp: fix rport unblock race with LUN recovery") Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…-text symbols" commit 63be1a8 upstream. This reverts commit 83e840c ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols"). Chandan reported that on newer kernels, trying to enable function_graph tracer on ppc64 (BE) locks up the system with the following trace: Unable to handle kernel paging request for data at address 0x600000002fa30010 Faulting instruction address: 0xc0000000001f1300 Thread overran stack, or stack corrupted Oops: Kernel access of bad area, sig: 11 [#1] BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries Modules linked in: CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty #20 task: c000000625c07200 task.stack: c000000625c07310 NIP: c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8 REGS: c000000625c088c0 TRAP: 0380 Not tainted (4.14.0-rc3-00162-g6e51f1f-dirty) MSR: 8000000000001032 <SF,ME,IR,DR,RI> CR: 28002848 XER: 00000000 CFAR: c0000000001f1320 SOFTE: 0 ... NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90 LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0 Call Trace: [c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable) [c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0 [c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0 [c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130 ... 
[c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0 [c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0 This is because ftrace is using ppc_function_entry() for obtaining the address of return_to_handler() in prepare_ftrace_return(). The call to kernel_text_address() itself gets traced and we end up in a recursive loop. Fixes: 83e840c ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols") Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 624f5ab upstream. syzkaller reported a NULL pointer dereference in asn1_ber_decoder(). It can be reproduced by the following command, assuming CONFIG_PKCS7_TEST_KEY=y: keyctl add pkcs7_test desc '' @s The bug is that if the data buffer is empty, an integer underflow occurs in the following check: if (unlikely(dp >= datalen - 1)) goto data_overrun_error; This results in the NULL data pointer being dereferenced. Fix it by checking for 'datalen - dp < 2' instead. Also fix the similar check for 'dp >= datalen - n' later in the same function. That one possibly could result in a buffer overread. The NULL pointer dereference was reproducible using the "pkcs7_test" key type but not the "asymmetric" key type because the "asymmetric" key type checks for a 0-length payload before calling into the ASN.1 decoder but the "pkcs7_test" key type does not. The bug report was: BUG: unable to handle kernel NULL pointer dereference at (null) IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014 task: ffff9b6b3798c040 task.stack: ffff9b6b37970000 RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0 RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0 Call Trace: pkcs7_parse_message+0xee/0x240 
crypto/asymmetric_keys/pkcs7_parser.c:139 verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216 pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63 key_create_or_update+0x180/0x530 security/keys/key.c:855 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4585c9 RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8 RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9 RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000 RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000 Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff <42> 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78 CR2: 0000000000000000 Fixes: 42d5ec2 ("X.509: Add an ASN.1 decoder") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
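The integer underflow described in this commit message can be reproduced in isolation. The sketch below assumes that both `dp` and `datalen` are unsigned `size_t` values and that `dp <= datalen` holds when the check runs; the function names `overrun_check_buggy()` and `overrun_check_fixed()` are hypothetical and only wrap the two comparisons quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Original form: 'datalen - 1' wraps around to SIZE_MAX when datalen == 0,
 * so the comparison is false and the overrun goes undetected. */
static bool overrun_check_buggy(size_t dp, size_t datalen)
{
    return dp >= datalen - 1;
}

/* Fixed form from the commit message: 'datalen - dp < 2'.
 * With dp <= datalen the subtraction cannot wrap. */
static bool overrun_check_fixed(size_t dp, size_t datalen)
{
    assert(dp <= datalen); /* precondition assumed by this sketch */
    return datalen - dp < 2;
}
```

For an empty buffer (`dp == 0`, `datalen == 0`) the first form reports no overrun while the second one does; for non-empty buffers the two forms agree.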
… updates commit 38c53af upstream. Commit 5e98596 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation", 2016-12-20) added code that tries to exclude any use or update of the hashed page table (HPT) while the HPT resizing code is iterating through all the entries in the HPT. It does this by taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done flag and then sending an IPI to all CPUs in the host. The idea is that any VCPU task that tries to enter the guest will see that the hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma, which also takes the kvm->lock mutex and will therefore block until we release kvm->lock. However, any VCPU that is already in the guest, or is handling a hypervisor page fault or hypercall, can re-enter the guest without rechecking the hpte_setup_done flag. The IPI will cause a guest exit of any VCPUs that are currently in the guest, but does not prevent those VCPU tasks from immediately re-entering the guest. The result is that after resize_hpt_rehash_hpte() has made a HPTE absent, a hypervisor page fault can occur and make that HPTE present again. This includes updating the rmap array for the guest real page, meaning that we now have a pointer in the rmap array which connects with pointers in the old rev array but not the new rev array. In fact, if the HPT is being reduced in size, the pointer in the rmap array could point outside the bounds of the new rev array. 
If that happens, we can get a host crash later on such as this one: [91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c [91652.628668] Faulting instruction address: 0xc0000000000e2640 [91652.628736] Oops: Kernel access of bad area, sig: 11 [#1] [91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV [91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259] [91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G O 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1 [91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000 [91652.629750] NIP: c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0 [91652.629829] REGS: c0000000028db5d0 TRAP: 0300 Tainted: G O (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le) [91652.629932] MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022422 XER: 00000000 [91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1 [91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000 [91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000 [91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70 [91652.630034] GPR12: 
c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000 [91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10 [91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000 [91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000 [91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100 [91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120 [91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv] [91652.630884] Call Trace: [91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable) [91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv] [91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv] [91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm] [91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm] [91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm] [91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0 [91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130 [91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c [91652.631635] Instruction dump: [91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110 [91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008 [91652.631959] ---[ end trace ac85ba6db72e5b2e ]--- To fix this, we tighten up the way that the hpte_setup_done flag is checked to ensure that it does provide the guarantee that the resizing code needs. In kvmppc_run_core(), we check the hpte_setup_done flag after disabling interrupts and refuse to enter the guest if it is clear (for a HPT guest). 
The code that checks hpte_setup_done and calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv() to a point inside the main loop in kvmppc_run_vcpu(), ensuring that we don't just spin endlessly calling kvmppc_run_core() while hpte_setup_done is clear, but instead have a chance to block on the kvm->lock mutex. Finally we also check hpte_setup_done inside the region in kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about to update the HPTE, and bail out if it is clear. If another CPU is inside kvm_vm_ioctl_resize_hpt_commit() and has cleared hpte_setup_done, then we know that either we are looking at a HPTE that resize_hpt_rehash_hpte() has not yet processed, which is OK, or else we will see hpte_setup_done clear and refuse to update it, because of the full barrier formed by the unlock of the HPTE in resize_hpt_rehash_hpte() combined with the locking of the HPTE in kvmppc_book3s_hv_page_fault(). Fixes: 5e98596 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation") Reported-by: Satheesh Rajendran <satheera@in.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6151b8b ] ppp_release() tries to ensure that netdevices are unregistered before decrementing the unit refcount and running ppp_destroy_interface(). This is all fine as long as the device is unregistered by ppp_release(): the unregister_netdevice() call, followed by rtnl_unlock(), guarantees that the unregistration process completes before rtnl_unlock() returns. However, the device may be unregistered by other means (like ppp_nl_dellink()). If this happens right before ppp_release() calls rtnl_lock(), then ppp_release() has to wait for the concurrent unregistration code to release the lock. But rtnl_unlock() releases the lock before completing the device unregistration process. This allows ppp_release() to proceed and eventually call ppp_destroy_interface() before the unregistration process completes. Calling free_netdev() on this partially unregistered device will BUG(): ------------[ cut here ]------------ kernel BUG at net/core/dev.c:8141! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ppp_destroy_interface+0xd8/0xe0 [ppp_generic] ppp_disconnect_channel+0xda/0x110 [ppp_generic] ppp_unregister_channel+0x5e/0x110 [ppp_generic] pppox_unbind_sock+0x23/0x30 [pppox] pppoe_connect+0x130/0x440 [pppoe] SYSC_connect+0x98/0x110 ? do_fcntl+0x2c0/0x5d0 SyS_connect+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa5 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88 ---[ end trace ed294ff0cc40eeff ]--- We could set the ->needs_free_netdev flag on PPP devices and move the ppp_destroy_interface() logic in the ->priv_destructor() callback. But that'd be quite intrusive as we'd first need to unlink from the other channels and units that depend on the device (the ones that used the PPPIOCCONNECT and PPPIOCATTACH ioctls). Instead, we can just let the netdevice hold a reference on its ppp_file. 
This reference is dropped in ->priv_destructor(), at the very end of the unregistration process, so that neither ppp_release() nor ppp_disconnect_channel() can call ppp_destroy_interface() in the interim. Reported-by: Beniamino Galvani <bgalvani@redhat.com> Fixes: 8cb775b ("ppp: fix device unregistration upon netns deletion") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 864e2a1 ] When syzkaller team brought us a C repro for the crash [1] that had been reported many times in the past, I finally could find the root cause. If FlowLabel info is merged by fl6_merge_options(), we leave part of the opt_space storage provided by udp/raw/l2tp with random value in opt_space.tot_len, unless a control message was provided at sendmsg() time. Then ip6_setup_cork() would use this random value to perform a kzalloc() call. Undefined behavior and crashes. Fix is to properly set tot_len in fl6_merge_options() At the same time, we can also avoid consuming memory and cpu cycles to clear it, if every option is copied via a kmemdup(). This is the change in ip6_setup_cork(). [1] kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 6613 Comm: syz-executor0 Not tainted 4.14.0-rc4+ #127 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801cb64a100 task.stack: ffff8801cc350000 RIP: 0010:ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: 0018:ffff8801cc357550 EFLAGS: 00010203 RAX: dffffc0000000000 RBX: ffff8801cc357748 RCX: 0000000000000010 RDX: 0000000000000002 RSI: ffffffff842bd1d9 RDI: 0000000000000014 RBP: ffff8801cc357620 R08: ffff8801cb17f380 R09: ffff8801cc357b10 R10: ffff8801cb64a100 R11: 0000000000000000 R12: ffff8801cc357ab0 R13: ffff8801cc357b10 R14: 0000000000000000 R15: ffff8801c3bbf0c0 FS: 00007f9c5c459700(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020324000 CR3: 00000001d1cf2000 CR4: 00000000001406f0 DR0: 0000000020001010 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: ip6_make_skb+0x282/0x530 net/ipv6/ip6_output.c:1729 udpv6_sendmsg+0x2769/0x3380 
net/ipv6/udp.c:1340 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x358/0x5a0 net/socket.c:1750 SyS_sendto+0x40/0x50 net/socket.c:1718 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4520a9 RSP: 002b:00007f9c5c458c08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004520a9 RDX: 0000000000000001 RSI: 0000000020fd1000 RDI: 0000000000000016 RBP: 0000000000000086 R08: 0000000020e0afe4 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004bb1ee R13: 00000000ffffffff R14: 0000000000000016 R15: 0000000000000029 Code: e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 ea 0f 00 00 48 8d 79 04 48 b8 00 00 00 00 00 fc ff df 45 8b 74 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 RIP: ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: ffff8801cc357550 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb0c199 upstream. dvb_detach(arg) calls symbol_put_addr(arg), where arg should be a pointer to a function. Right now a pointer to state->dib7000p_ops is passed to dvb_detach(), which causes a BUG() in symbol_put_addr() as discovered by syzkaller. Pass state->dib7000p_ops.set_wbd_ref instead. ------------[ cut here ]------------ kernel BUG at kernel/module.c:1081! invalid opcode: 0000 [#1] PREEMPT SMP KASAN Modules linked in: CPU: 1 PID: 1151 Comm: kworker/1:1 Tainted: G W 4.14.0-rc1-42251-gebb2c2437d80 #224 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: usb_hub_wq hub_event task: ffff88006a336300 task.stack: ffff88006a7c8000 RIP: 0010:symbol_put_addr+0x54/0x60 kernel/module.c:1083 RSP: 0018:ffff88006a7ce210 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880062a8d190 RCX: 0000000000000000 RDX: dffffc0000000020 RSI: ffffffff85876d60 RDI: ffff880062a8d190 RBP: ffff88006a7ce218 R08: 1ffff1000d4f9c12 R09: 1ffff1000d4f9ae4 R10: 1ffff1000d4f9bed R11: 0000000000000000 R12: ffff880062a8d180 R13: 00000000ffffffed R14: ffff880062a8d190 R15: ffff88006947c000 FS: 0000000000000000(0000) GS:ffff88006c900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6416532000 CR3: 00000000632f5000 CR4: 00000000000006e0 Call Trace: stk7070p_frontend_attach+0x515/0x610 drivers/media/usb/dvb-usb/dib0700_devices.c:1013 dvb_usb_adapter_frontend_init+0x32b/0x660 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:286 dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:86 dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:162 dvb_usb_device_init+0xf70/0x17f0 drivers/media/usb/dvb-usb/dvb-usb-init.c:277 dib0700_probe+0x171/0x5a0 drivers/media/usb/dvb-usb/dib0700_core.c:886 usb_probe_interface+0x35d/0x8e0 drivers/usb/core/driver.c:361 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 
drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_set_configuration+0x104e/0x1870 drivers/usb/core/message.c:1932 generic_probe+0x73/0xe0 drivers/usb/core/generic.c:174 usb_probe_device+0xaf/0xe0 drivers/usb/core/driver.c:266 really_probe drivers/base/dd.c:413 driver_probe_device+0x610/0xa00 drivers/base/dd.c:557 __device_attach_driver+0x230/0x290 drivers/base/dd.c:653 bus_for_each_drv+0x161/0x210 drivers/base/bus.c:463 __device_attach+0x26e/0x3d0 drivers/base/dd.c:710 device_initial_probe+0x1f/0x30 drivers/base/dd.c:757 bus_probe_device+0x1eb/0x290 drivers/base/bus.c:523 device_add+0xd0b/0x1660 drivers/base/core.c:1835 usb_new_device+0x7b8/0x1020 drivers/usb/core/hub.c:2457 hub_port_connect drivers/usb/core/hub.c:4903 hub_port_connect_change drivers/usb/core/hub.c:5009 port_event drivers/usb/core/hub.c:5115 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119 worker_thread+0x221/0x1850 kernel/workqueue.c:2253 kthread+0x3a1/0x470 kernel/kthread.c:231 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Code: ff ff 48 85 c0 74 24 48 89 c7 e8 48 ea ff ff bf 01 00 00 00 e8 de 20 e3 ff 65 8b 05 b7 2f c2 7e 85 c0 75 c9 e8 f9 0b c1 ff eb c2 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 b8 00 00 RIP: symbol_put_addr+0x54/0x60 RSP: ffff88006a7ce210 ---[ end trace b75b357739e7e116 ]--- Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 199512b upstream.

If 'p' is 0 for the software Diffie-Hellman implementation, then dh_max_size() returns 0. In the case of KEYCTL_DH_COMPUTE, this causes ZERO_SIZE_PTR to be passed to sg_init_one(), which with CONFIG_DEBUG_SG=y triggers the 'BUG_ON(!virt_addr_valid(buf));' in sg_set_buf().

Fix this by making crypto_dh_decode_key() reject 0 for 'p'. p=0 makes no sense for any DH implementation because 'p' is supposed to be a prime number. Moreover, 'mod 0' is not mathematically defined.

Bug report:

kernel BUG at ./include/linux/scatterlist.h:140!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 27112 Comm: syz-executor2 Not tainted 4.14.0-rc7-00010-gf5dbb5d0ce32-dirty #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
task: ffff88006caac0c0 task.stack: ffff88006c7c8000
RIP: 0010:sg_set_buf include/linux/scatterlist.h:140 [inline]
RIP: 0010:sg_init_one+0x1b3/0x240 lib/scatterlist.c:156
RSP: 0018:ffff88006c7cfb08 EFLAGS: 00010216
RAX: 0000000000010000 RBX: ffff88006c7cfe30 RCX: 00000000000064ee
RDX: ffffffff81cf64c3 RSI: ffffc90000d72000 RDI: ffffffff92e937e0
RBP: ffff88006c7cfb30 R08: ffffed000d8f9fab R09: ffff88006c7cfd30
R10: 0000000000000005 R11: ffffed000d8f9faa R12: ffff88006c7cfd30
R13: 0000000000000000 R14: 0000000000000010 R15: ffff88006c7cfc50
FS: 00007fce190fa700(0000) GS:ffff88003ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffc6b33db8 CR3: 000000003cf64000 CR4: 00000000000006f0
Call Trace:
__keyctl_dh_compute+0xa95/0x19b0 security/keys/dh.c:360
keyctl_dh_compute+0xac/0x100 security/keys/dh.c:434
SYSC_keyctl security/keys/keyctl.c:1745 [inline]
SyS_keyctl+0x72/0x2c0 security/keys/keyctl.c:1641
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4585c9
RSP: 002b:00007fce190f9bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000fa
RAX: ffffffffffffffda RBX: 0000000000738020 RCX: 00000000004585c9
RDX: 000000002000d000 RSI: 0000000020000ff4 RDI: 0000000000000017
RBP: 0000000000000046 R08: 0000000020008000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff6e610cde
R13: 00007fff6e610cdf R14: 00007fce190fa700 R15: 0000000000000000
Code: 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 33 5b 45 89 6c 24 14 41 5c 41 5d 41 5e 41 5f 5d c3 e8 fd 8f 68 ff <0f> 0b e8 f6 8f 68 ff 0f 0b e8 ef 8f 68 ff 0f 0b e8 e8 8f 68 ff 20
RIP: sg_set_buf include/linux/scatterlist.h:140 [inline] RSP: ffff88006c7cfb08
RIP: sg_init_one+0x1b3/0x240 lib/scatterlist.c:156 RSP: ffff88006c7cfb08

Fixes: 802c7f1 ("crypto: dh - Add DH software implementation")
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
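The idea behind the DH fix above (refuse a zero 'p' when the key is decoded, before any buffer is sized from it) can be illustrated with a small user-space sketch. This is not the kernel's code: bignum_is_zero() and check_dh_prime() are hypothetical names, and the big-endian byte-buffer representation of 'p' is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true if the
 * big-endian integer stored in buf[0..len) is zero. An empty buffer
 * is treated as zero as well.
 */
static bool bignum_is_zero(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != 0)
			return false;
	}
	return true;
}

/*
 * Hypothetical validation step: reject a DH prime 'p' of zero up front,
 * so that later code never derives an output size of 0 from it.
 * Returns 0 if 'p' is acceptable and -1 otherwise.
 */
static int check_dh_prime(const uint8_t *p, size_t p_len)
{
	if (p == NULL || bignum_is_zero(p, p_len))
		return -1;
	return 0;
}
```

Checking at decode time keeps the rejection in one place, so callers such as the KEYCTL_DH_COMPUTE path do not each need their own guard.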
[ Upstream commit 7fd0783 ]

A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will cause a divide error in usbnet_probe:

divide error: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bef5c00 task.stack: ffff88006bf60000
RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
Call Trace:
usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
really_probe drivers/base/dd.c:413
driver_probe_device+0x522/0x740 drivers/base/dd.c:557

Fix by simply ignoring the bogus descriptor, as it is optional for QMI devices anyway.

Fixes: 423ce8c ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
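The pattern in the qmi_wwan fix above, treating a descriptor with a zero segment size as if it were absent so that it never reaches a division, can be sketched in plain C. Everything here is illustrative: struct ether_desc, DEFAULT_SEGMENT_SIZE, pick_segment_size() and queue_len() are hypothetical names, and the fallback value of 1514 bytes is an assumption, not what the driver uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant field of a CDC Ethernet descriptor. */
struct ether_desc {
	uint16_t max_segment_size;
};

/* Assumed fallback used when no usable descriptor is present. */
#define DEFAULT_SEGMENT_SIZE 1514u

/*
 * The descriptor is optional: a missing one and one advertising a
 * segment size of 0 are handled the same way, by ignoring it.
 */
static unsigned int pick_segment_size(const struct ether_desc *desc)
{
	if (desc != NULL && desc->max_segment_size != 0)
		return desc->max_segment_size;
	return DEFAULT_SEGMENT_SIZE;
}

/* The divisor can no longer be zero, whatever the device reports. */
static unsigned int queue_len(unsigned int budget_bytes,
			      const struct ether_desc *desc)
{
	return budget_bytes / pick_segment_size(desc);
}
```

Validating the device-supplied value where it is read, rather than at the division, means every later consumer of the size is covered by the same check.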
[ Upstream commit 0de0add ] When we receive a packet on a QMI device in raw IP mode, we should call skb_reset_mac_header() to ensure that skb->mac_header contains a valid offset in the packet. While it shouldn't really matter, the packets have no MAC header and the interface is configured as-such, it seems certain parts of the network stack expects a "good" value in skb->mac_header. Without the skb_reset_mac_header() call added in this patch, for example shaping traffic (using tc) triggers the following oops on the first received packet: [ 303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0 [ 303.655045] Kernel bug detected[#1]: [ 303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0 [ 303.664339] task: 8fdf05e0 task.stack: 8f15c000 [ 303.668844] $ 0 : 00000000 00000001 0000007a 00000000 [ 303.674062] $ 4 : 8149a2fc 8149a2fc 8149ce20 00000000 [ 303.679284] $ 8 : 00000030 3878303a 31623465 20303235 [ 303.684510] $12 : ded731e3 2626a277 00000000 03bd0000 [ 303.689747] $16 : 8ef62b40 00000043 8f137918 804db5fc [ 303.694978] $20 : 00000001 00000004 8fc13800 00000003 [ 303.700215] $24 : 00000001 8024ab10 [ 303.705442] $28 : 8f15c000 8fc19cf0 00000043 802cc920 [ 303.710664] Hi : 00000000 [ 303.713533] Lo : 74e58000 [ 303.716436] epc : 802cc920 skb_panic+0x58/0x5c [ 303.721046] ra : 802cc920 skb_panic+0x58/0x5c [ 303.725639] Status: 11007c03 KERNEL EXL IE [ 303.729823] Cause : 50800024 (ExcCode 09) [ 303.733817] PrId : 0001992f (MIPS 1004Kc) [ 303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4) [ 303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0 [ 303.970871] 8e4b1520 8fec1800 00000043 802cd2a4 6e000045 
00000043 00000000 8ef62000 [ 303.979219] 8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664 [ 303.987568] 8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003 [ 303.995934] 00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700 [ 304.004324] ... [ 304.006767] Call Trace: [ 304.009241] [<802cc920>] skb_panic+0x58/0x5c [ 304.013504] [<802cd2a4>] skb_push+0x78/0x90 [ 304.017783] [<8f137918>] 0x8f137918 [ 304.021269] Code: 0060282 0c02a3b4 24842888 <000c000d> 8c870060 8c8200a0 0007382b 00070336 8c88005c [ 304.031034] [ 304.032805] ---[ end trace b778c482b3f0bda9 ]--- [ 304.041384] Kernel panic - not syncing: Fatal exception in interrupt [ 304.051975] Rebooting in 3 seconds.. While the oops is for a 4.9-kernel, I was able to trigger the same oops with net-next as of yesterday. Fixes: 32f7adf ("net: qmi_wwan: support "raw IP" mode") Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
Now that all prerequisites are merged into |
|
Sorry, this is one of a few things I've postponed until xmas holidays (i.e., now) due to the usual end-of-year chaos at the office, please bear with me as I go through my backlog the coming two weeks. Specifically I've realized that we're not enabling runtime PM because we're bailing out of |
|
Regarding your comment of Nov 8: I only had an older acpidump of the MBP13,3 (MBP133.88Z.0226.B00.1610231055) and the BLTH device is missing the _PRW method there. I downloaded the firmware volume of your BIOS version MBP133.88Z.0226.B23.1704201604, extracted and disassembled the DSDT and yes indeed, _PRW is present there and specifies the SMC's GPE as wake event. If _PRW is found on the Bluetooth ACPI device, the ACPI core automatically enables wakeup for the physical companion device once it's registered. So I've added As for the clock rate, the code looks like it should be sufficient to only set Updated patch is on my hci_bcm_v1 branch but I still need to figure out the runtime PM stuff. Thanks a lot for taking the time to debug wakeup and baudrate. |
|
Alright, I've amended the commit to move the runtime PM code to a different location so that it gets executed on Macs and added another commit to streamline the runtime PM code a bit. Both are on my hci_bcm_v1 branch, the commit ids are currently 5e30ec7ead67 and 4efe2fb5fc55. Could someone test this? @roadrunner2 @peterychuang |
|
I've just tested the two commits on top of the latest linux-next. Functionally it works as before, though I notice stuff like the following in the dmesg after using it for a bit: I don't think I've seen these things before, though in fairness I don't use Bluetooth very often, so they might have been there before, and I just didn't know. In any case, the patches work, so perhaps the junk messages above don't matter. |
|
So, after finally figuring out which kernel modules to enable (I had to enable |
|
The I'm wondering if this can be attributed to unstable stuff that is currently in linux-next. Maybe try with 4.15-rc4 or Linus' current tree to see if the messages persist? Thanks for testing guys! |
|
@l1k: the I agree It may be something weird on linux-next. I'll test again when the next linux-next comes out. And thanks for your work! |
The main two issues were the timing out of operations due to wrong baudrate (2nd commit) and not finding the UART (3rd commit). Things work reasonably well with this, though I still occasionally see a timeout on the first command, necessitating a re-attach of the tty.
I still see the following error when attaching the tty:
But it is benign, as the UART device does not appear to need its baud rate changed.
Finally, the 4th commit fixes the following two BUG's: